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Abstract 

The  focus  of  this  research  was  to  develop  a  joint  pupil  and  focal  plane  image 
recovery  algorithm  for  use  with  coherent  LADAR  systems.  The  benefits  of  such  a 
system  would  include  increased  resolution  with  little  or  no  increase  in  system  weight 
and  volume  as  well  as  allowing  for  operation  in  the  absence  of  natural  light  since  the 
target  of  interest  would  be  actively  illuminated.  Since  a  pupil  plane  collection  aperture 
can  be  conformal,  such  a  system  would  also  potentially  allow  for  the  formation  of  large 
synthetic  apertures. 

The  algorithm  developed  used  many  frames  of  coherent  pupil  and  focal  plane 
data.  The  data  frames  are  summed  in  the  respective  planes  to  give  two  data  sets 
(one  for  each  plane).  Appropriate  statistical  models  are  used  and  a  joint  Maximum 
Likelihood  estimator  is  formed.  The  algorithm  is  tested  using  a  Monte  Carlo  approach. 
The  system  is  demonstrated  to  be  robust  and  in  all  but  extreme  cases  yields  better 
results  than  algorithms  using  a  single  data  set  (such  as  deconvolution).  It  was  shown 
that  the  joint  algorithm  had  a  resolution  increase  of  70%  over  deconvolution  alone 
and  a  40%  increase  over  traditional  pupil  plane  algorithms.  It  was  also  demonstrated 
that  the  new  algorithm  does  not  suffer  as  severely  from  stagnation  problems  typical 
with  pupil  plane  algorithms.  A  stopping  criteria  based  on  the  statistics  of  the  data 
was  also  developed. 
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Joint  Image  and  Pupil  Plane  Reconstruction  Algorithm 
based  on  Bayesian  Techniques 

I.  Introduction 

The  focus  of  this  research  is  to  explore  the  problem  of  coherent  image  recon¬ 
struction  using  large  synthetic  aperture  arrays.  The  benefits  would  include  a  large 
increase  in  system  resolution  for  a  small  increase  in  system  weight  and  volume.  This 
type  of  imaging  system  would  also  allow  for  conformal  optical  detectors  to  be  placed 
on  existing  platforms.  Furthermore  since  the  proposed  system  is  an  active  imaging 
system,  it  would  be  able  to  be  used  when  natural  light  was  not  available  and  would 
provide  higher  theoretical  resolution  than  traditional  IR  systems. 

1.1  Problem  Definition 

The  two  primary  limiting  factors  for  optical  resolution  arc  limited  aperture  size 
and  phase  error  in  the  propagation  paths.  The  phase  error  can  be  broken  into  two 
parts:  1)  system  static  aberrations  and  2)  turbulence  in  the  propagation  path.  There 
are  many  methods  for  characterizing  and  dealing  with  static  system  aberrations;  we 
will  propose  a  system  design  to  mitigate  the  other  two  problems. 

Since  it  is  known  that  the  aperture  of  an  imaging  system  limits  spatial  reso¬ 
lution,  it  is  desirable  to  form  larger  apertures.  Large  apertures  have  disadvantages 
as  well;  they  are  more  difficult  to  manufacture,  and  due  to  weight,  and  size  ,  they 
are  not  feasible  for  space-based  applications  [10].  By  taking  advantage  of  the  coher¬ 
ence  properties  of  laser  light,  it  is  possible  to  form  a  synthetic  aperture  array  from 
many  smaller,  monolithic  apertures.  By  doing  this,  one  can  expect  to  obtain  higher 
spatial  resolution  than  can  be  produced  from  existing  monolithic  apertures.  Since 
it  is  difficult  to  recover  absolute  phase  of  an  optical  held,  it  is  desirable  to  form  the 
synthetic  aperture  without  interfering  light  from  the  subapertures  [11].  To  accom¬ 
plish  this,  a  joint  estimation  algorithm  using  both  pupil  and  image  plane  data  will 
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be  formed.  The  pupil  plane  data  will  be  collected  by  the  synthetic  aperture  while 
the  image  plane  data  will  be  collected  from  a  reasonable  sized  monolithic  aperture. 
The  pupil  plane  estimation  algorithm  will  be  based  on  the  correlography  methods  of 
Fienup  and  Idcll  [21],  while  the  image  plane  algorithm  will  be  derived  from  the  de- 
convolution  methods  demonstrated  by  MacDonald  [24] .  These  methods  are  combined 
in  this  research  to  form  an  improved  image  retrieval  algorithm. 

1.2  Previous  Image  recovery  work 

1.2.1  Deconvolution.  Deconvolution  is  the  process  of  estimating  an  unknown 
function,  f(x),  from  a  noisy  measurement  of  the  convolution  of  f(x)  with  a  known 
function  h(x).  If  we  have  no  knowledge  of  h(x)  it  must  also  be  estimated  and  the 
problem  becomes  blind  deconvolution. 

1.2.2  Phase  Retrieval/Imaging  Correlography.  The  term  phase  retrieval  is 
used  to  describe  any  method  of  forming  an  image  from  the  Fourier  Modulus  of  the 
object  field.  The  foundational  work  on  phase  retrieval  was  done  by  J.R.  Fienup  [7]. 
In  this  paper  two  iterative  methods  to  recover  an  object  from  Fourier  modulus  data 
are  shown;  both  methods  are  derived  from  the  Gerchberg-Saxon  algorithm  [17].  The 
first  of  these  methods  is  known  as  the  error  reduction  approach  and  is  shown  in 
Figure  1.1a.  In  the  error  reduction  approach  the  algorithm  begins  with  a  random 
guess  of  the  object  brightness  function.  This  initial  estimate  is  Fourier  transformed 
and  the  modulus  of  the  transform  is  replaced  with  the  measured  pupil  plane  data, 
which  is  then  inverse  Fourier  transformed  to  form  a  new  estimate.  The  new  estimate  is 
forced  to  comply  with  any  known  constraints  in  the  object  domain.  These  constraints 
include  non-negativity  of  the  object  and  any  known  object  support.  The  term  ’’object 
support”  is  used  to  refer  to  any  area  where  the  object  is  known  to  be  non-zero.  Since 
the  Fourier  domain  data  is  related  to  the  object  autocorrelation,  the  diameter  of 
the  object  support  is  related  to  the  diameter  of  the  Fourier  domain  data  [5,12];  the 
autocorrelation  support  will  not  give  a  unique  solution  for  the  object  support,  but 
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Figure  1.1:  Fienup’s  phase  retrieval  algorithms  (a)  error  reduction  (b)  input-output 
method  [7] 

the  union  of  all  possible  object  support  sets  can  be  found.  This  algorithm  is  run 
for  a  set  number  of  iterations  before  exiting.  In  Figure  1.1b  we  see  the  input-output 
approach.  This  method  will  not  be  discussed,  since  the  differences  are  minor  and 
unimportant  in  this  discussion.  Other  methods  of  phase  retrieval,  such  as  various 
gradient  methods  [8]  and  deconvolution  techniques  [33] ,  can  be  found  in  the  literature. 
Phase  retrieval  algorithms  will  often  suffer  from  stagnation  or  uniqueness  problems, 
however,  efforts  have  been  made  to  avoid  these  problems  [4,15]. 

Another  method  of  forming  images  from  pupil  plane  data  is  imaging  correlog- 
raphy.  This  technique  takes  advantage  of  the  fact  that  the  autocorrelation  of  the 
object  brightness  function  and  the  squared  modulus  of  the  Fourier  transform  of  the 
object  form  a  Fourier  transform  pair  [2],  An  iterative  method  of  recovering  images 
from  correlations  was  shown  by  Schulz  [32],  Again  the  details  of  Schulz’s  work  are 
unimportant  to  this  work  other  than  to  demonstrate  that  images  can  be  recovered 
from  autocorrelations  with  some  degree  of  success. 

Both  of  the  above  techniques  benefit  from  an  estimate  of  the  object  support. 
An  algorithm  that  can  directly  measure  a  low  resolution  image  and  the  pupil  plane 
intensity  was  proposed  by  Fienup  [14],  It  is  important  to  note  he  did  not  derive  a 
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joint  estimator  from  both  data  sets,  rather  he  used  the  image  as  a  support  constraint, 
and  the  technique  offered  no  method  for  dealing  with  atmospheric  turbulence. 

The  data  model  this  research  uses  is  found  in  work  by  Idell  [21].  The  model 
show  that  the  average  of  many  realizations  of  the  squared  modulus  of  a  speckled 
autocorrelation  will  converge  (M  convergence)  to  the  convolution  of  the  true  autocor¬ 
relation  and  the  PSF  plus  a  dc  term.  Idell  used  this  model  and  classic  phase  retrieval 
techniques  to  recover  an  image.  This  research  differs  in  that  it  will  use  the  same  data 
model,  but  will  form  a  maximum  likelihood  (ML)  estimator. 

1.2.3  Prior  Synthetic  Aperture  efforts.  The  method  of  image  formation 
this  research  will  propose  is  a  lensless  imaging  technique;  this  means  an  image  is 
not  formed  by  the  optics,  but  rather  is  calculated  from  the  non-imaged  pupil  plane 
data.  Imaging  correlography  is  one  method  of  doing  this  for  incoherent  light  [13]. 
To  form  lenslcss  coherent  images  other  techniques  must  be  employed.  If  the  field 
in  the  pupil  plane  is  measured  using  heterodyne  detection  this  is  a  trivial  problem 
of  numerically  propagating  the  field  and  forming  a  simulated  image  in  a  computer; 
however,  heterodyne  detection  at  optical  wavelengths  is  challenging  and  has  yet  to 
be  proven  feasible  for  this  application.  For  this  reason  other  methods  must  be  used. 
Fienup  has  proposed  applying  his  phase  retrieval  algorithm  along  with  a  shaped  illu¬ 
mination  constraint  to  this  problem  [11],  This  technique  has  been  shown  to  work  well 
with  sharp-edged  illumination  patterns,  but  not  as  well  for  soft-edged  illumination 
patterns  [27].  The  difficulty  is  that  sharp-edged  illumination  patterns  require  large 
projection  optics.  If  the  large  optics  are  available  for  beam  projection  they  can  also 
be  used  for  imaging,  neglecting  the  benefits  of  the  lensless  array. 

This  work  is  a  continuation  of  initial  studies  done  by  Cain  [3].  The  work  by  Dr. 
Cain  was  a  proof  of  concept  that  includes  simplifications  which  will  be  removed.  The 
differences  in  this  work  include,  but  are  not  limited  to,  using  a  more  complete  data 
model  for  the  pupil  plane  data,  and  using  a  more  accurate  model  for  image  plane 
statistics. 
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1 . 3  Significant  Contributions 

This  work  resulted  in  the  following  significant  contributions: 

1.  Establish  a  maximum  likelihood  phase  retrieval  algorithm 

2.  Establish  a  joint  estimation  algorithm  that  is  appropriate  for  use  with  coherent 
LADAR  imagery 

3.  Demonstrate  the  algorithms  are  less  sensitive  to  atmospheric  turbulence  that 
other  methods  of  image  recovery 

4.  Establish  a  new  stopping/damping  criteria  for  use  with  existing  deconvolution 
algorithms  which  will  avoid  noise  amplification. 

1.4  Document  Outline 

The  remainder  of  this  document  is  organized  as  follows.  Chapter  2  will  discuss 
applicable  optical  theory  while  Chapter  3  will  discuss  estimation  theory.  In  Chapter  4 
a  review  of  deconvolution  is  included,  this  chapter  will  also  outline  a  statistics-based 
stopping  criteria.  Chapter  5  will  demonstrate  a  new  pupil  plane  algorithm.  Finally 
chapter  6  will  combine  the  algorithms  of  the  previous  two  chapters  and  demonstrate 
a  robust  joint  algorithm  for  image  recovery. 


5 


II.  Optical  Theory 

Active  imaging  techniques  rely  heavily  on  understanding  fundamental  properties  of 
light  including  propagation,  diffraction  and  coherence.  This  chapter  is  dedicated  to 
providing  an  overview  of  these  optical  properties.  Much  of  this  material  was  adapted 
from  [22,23], 

2.1  Fourier  Optics  and  Wave  Propagation 

This  section  is  devoted  to  giving  the  reader  an  understanding  of  Fourier  optics 
and  the  physics  of  wave  propagation.  It  will  begin  with  a  discussion  of  the  propagation 
of  monochromatic  light  and  will  conclude  with  a  brief  discussion  of  the  properties  of 
coherent  and  incoherent  illumination.  The  propagation  theory  is  drawn  from  [16,20]. 

2.1.1  Wave  Propagation  -  The  monochromatic  case.  Armed  with  the  knowl¬ 
edge  that  light  can  be  modeled  Electro-Magnetic  wave,  we  can  begin  to  formulate  a 
model  for  propagation.  We  can  write  an  expression  for  the  electric  field  (for  the  re¬ 
mainder  of  the  document  the  term  field  will  be  understood  to  mean  electric  field), 
u(£,r},t),  in  the  (£,77)  plane  as 

«(£,  V,  t)  =  A(£,  77)  cos[2vr vt  +  0(£,  77)]  (2.1) 

where  A(£,  77)  is  the  field  amplitude,  v  is  the  optical  frequency  of  the  field,  t  is  time, 
and  0(£,  77)  is  the  phase  at  position  (£,77).  For  simplicity  of  notation  we  will  define 
the  complex  amplitude,  U (£,  77)  of  the  field  as 

U  (£,  77)  =  A(£,77)e^  =  |u(£,  77)  |e^  (2.2) 

For  monochromatic  light  the  propagation  of  the  entire  field  can  be  accomplished  by 
adding  the  appropriate  phase  delay  to  each  point  and  summing  the  resultant  phasors 
in  the  (x,  y )  plane.  This  is  accomplished  using  the  Rayleigh- Sommerfeld  diffraction 


6 


integral 


(2.3) 


poo  poo  jkr 

U(x,y)  =  -^r  /  U{£,rj)—dnd£ 

J  A  y„oo  H 

where  z  is  the  normal  distance  between  the  input  and  output  propagation  planes,  A 

is  the  wavelength,  k—  and  r  is  defined  as 

r  =  yjjx  —  f)2  +  (y  —  r])2  +  z2  (2.4) 

2.  1.2  Fresnel  and  Fraunhoffer  Approximations.  The  Rayleigh- Sommerfeld 
formula,  Equation  2.3,  is  computationally  expensive  (order  iV4  for  &  N  by  N  ar¬ 
ray)  when  implemented  in  digital  simulations.  For  this  reason,  we  will  make  some 
simplifications  that  can  be  applied  provided  that  certain  criteria  are  satisfied. 

2. 1.2.1  Fresnel  Diffraction.  The  first  of  these  simplifications  is  the 
Fresnel  approximation  [20].  Using  the  Maclaurin  series  expansion  of  the  square  root 
given  by 

VT+b  =  l  +  \b-\b2  +  ...  (2.5) 

z  o 

we  can  rewrite  r  from  Equation  2.4 

r«z[l  +  t(ixl)2+t(^)2]  (2.6) 

eliminating  terms  of  higher  order  than  l(in  b).  By  taking  the  full  form  of  Equation  2.6 
in  the  exponential  term  and  only  the  first  term  in  the  denominator  term  Equation  2.3 
simplifies  to 

pjkz  poo  poo 

U(x,  y)  —  — —  /  /  U^,rj)eU(^2+(y-^dr,  (2.7) 

J  J _oc  J _0O 

By  expanding  the  quadratic  terms  in  the  exponential  this  equation  can  be  rewritten 
as 

U{x,  y )  =  Cj^e£(x2+y2)F2[U(f,  v)eJ^2W)]  (2.8) 
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where  JF2  is  the  2-D  Fourier  transform  defined  by 


/OO  /»oo 

/  U(Z,V)e-i2”W+^dZdri  (2.9) 

-oo  J — OO 

evaluated  at  —  and  fn  =  fz- 

This  approximation  is  valid  provided  the  observation  point  is  located  far  enough 
away  from  initial  plane.  This  distance  is  given  by  [20] 


z  3>  max 

(,x,y)£X0,(^v)^o 


O2  +  {y-  v)2]2 


(2.10) 


where  Xq  and  To  are  variables  in  M2  that  define  the  non- zero  regions  in  the  input  and 
output  planes  respectively.  The  output  plane  will  be  limited  by  an  aperture  function 
to  avoid  having  values  everywhere. 


2. 1.2. 2  Fraunhoffer  Diffraction.  The  propagation  integral  shown  in 
Equation  2.8  can  be  further  simplified  if  one  recognizes  that  the  quadratic  term  inside 
the  transfrom  is  approximately  1  for  large  z.  This  resulting  equation  is 

„jkz 

U(x,y)  =  (2.11) 


evaluated  at  fx  —  jf  and  fy  —  f-.  Which  means  the  output  held  is  a  simple  Fourier 
transform  of  the  input  held.  This  is  valid  when 


z  » 


max 


He + v2) 

2 


(2.12) 


It  is  desirable  whenever  possible  to  work  in  the  Fraunhoffer  region  dne  to  compu¬ 
tational  efficiency  (order  N2log2(N)  for  an  NxN  array  when  N  is  a  power  of  2  [28]) 
gained  by  using  the  Fourier  transform  and  the  fact  that  we  can  take  advantage  of 
many  properties  of  the  Fourier  transform. 


RMS  Roughness 
»  X 


Figure  2.1:  Illustration  of  the  phase  aberration  induced  when  coherent  light  is  re¬ 
flected  from  a  optically  rough  surface 

2.2  Speckle  Statistics 

The  speckled  appearance  of  coherent  light  scattered  from  a  rough  surface  is 
a  well  known  phenomenon.  This  speckle  is  caused  by  interference  patterns  in  the 
detector  plane.  Since  the  surface  roughness  is  random,  so  is  the  speckle  pattern. 
When  coherent  light  is  reflected  from  a  rough  surface,  the  shape  of  the  wavefront  is 
changed  in  a  random  manner  (see  Figure  2.1).  The  wavefront  deformation  is  modeled 
as  a  uniform  random  phase  (</>  G  ( — 7r,  7 r])  added  to  the  wavefront.  Speckle  has  some 
useful  statistical  properties. 

2.2.1  First  Order  Statistics.  This  section  is  devoted  to  formulating  a  statis¬ 
tical  model  to  describe  the  intensity  and  phase  of  propagated  optical  fields. 

2. 2. 1.1  Assumptions.  All  statistical  models  begin  with  assumptions; 
in  our  model  phase  and  amplitude  are  considered  random  variables.  First  we  assume 
the  target  is  optically  rough;  “optically  rough”  surfaces  have  roughness  (depth  and 
breadth)  on  the  order  of  the  wavelength  of  the  light.  The  roughness  of  these  surfaces 
is  modeled  as  a  uniform  random  variable  from  [—7 r,  7r].  Second,  we  assume  the  target 
is  illuminated  by  a  plane  wave;  this  assumption  is  not  necessary  but  will  simplify 
our  math  without  a  loss  of  generality.  Third,  we  assume  the  phase  and  amplitude 
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of  our  field  are  independent  of  the  other.  Finally,  we  add  the  assumption  that  the 
phases  and  amplitudes  of  all  contributing  components  are  independent  and  identically 
distributed. 


2.2. 1.2  Development.  The  field  at  the  receiver  pupil  is  modeled  as  a 
linear  combination  off  all  the  points  in  the  target  field: 

1  N 

A  =  —=  ^  rpr'V'0  =  aeje  (2.13) 

■  i= 1 

where  A=  is  the  magnitude  of  each  field  component,  6f;  is  the  phase  caused  by  the 
surface  roughness,  and  fa  is  the  phase  associated  with  the  atmospheric  propagation. 
By  writing  the  sum  in  this  form,  and  including  the  scale  factor  of  -)=  we  can  use 
random  phasor  sums  to  proceed  with  the  development.  We  are  only  concerned  with 
the  sum  of  these  two  phases,  which  when  phase  is  represented  modulo  2n,  can  be  seen 
to  possess  a  uniform  distribution  over  [ — 7r,  tt]  .  This  is  the  same  as  the  distribution 
on  9  so  we  will  drop  cj)  for  convenience.  A  is  complex  can  also  be  written  as 

A  =  a+j/3  (2.14) 


As  N  approached  infinity,  we  can  apply  the  central  limit  theorem  and  assume  a  and 
f3  are  Gaussian.  Therefore  if  we  can  calculate  the  mean  and  variance  of  a  and  /3  and 
their  correlation  coefficient,  p,  we  can  write  the  joint  probability  distribution  function 
for  a  and  /3.  We  begin  by  finding  the  mean  values 


E[a 


1 

y/N 


N 

^2,E[ai]  E  [cos  9j] 

i= 1 


o 


(2.15) 


since  the  expectation  is  taken  over  Q{  e  [ — 7r,  7r] .  By  the  same  argument  (3  is  also  zero 
mean.  The  variances  are  calculated  as  follows: 


E[(a  -  a)2}  =  E[a2} 


(2.16) 
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(2.17) 


^  JV  JV 

*M  =  v££  E[a,iak  cos  9i  cos  9k] 

i= 1  k=  1 

Now  for  each  i,  choose  q  G  M  such  that  9k  =  then 

^  N  N 

=  *££*[  aia/c]-E'[cos  6*j  cos  (2-18) 

i=l  fc=l 


^  Af  iV 

£(“2)  =  v  £  £  E[aiak]E[cos6iCosCi9i\  (2-19) 

i= 1  fc=l 

using  trigonometric  identities  can  be  written  as 


N  N 

£i“2]  =  ^££Bi“-“‘]B 

i= 1  fc=  1 

The  second  expectation  is  easily  shown  to  be 


cos(ftj  -  CjOi )  +  cos(6*j  +  c^) 
2 


(2.20) 


E 


cos  (9i 


Ci@i)  +  COS (Qi  ~\~  Cfli) 
2 


|  Cj  =  1  — >  i  =  k 
0  otherwise 


(2.21) 


which  yields  a  final  result  of 

£[a2]  =  Ml 


(2.22) 


In  a  similar  fashion  we  can  show  that  a  and  /3  are  uncorrelated.  The  values  calculated 
above  allow  us  to  write  the  joint  pdf  of  a  and  j3 


PaAa’P) 


exp 


a2  +  (32\ 

MIT  J 


(2.23) 


We  have  come  up  with  a  joint  pdf  of  the  real  and  imaginary  parts  of  the  held 
in  the  pupil,  but  we  want  a  marginal  pdf  on  intensity,  and  if  possible  a  marginal  pdf 
for  the  phase.  The  next  step  to  get  there  will  be  to  transform  pa^(a,  (3)  to  a  pdf  on 
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amplitude  and  phase.  The  resulting  joint  pdf  is 


PA,e{a,0) 


nE[a%] 


(2.24) 


by  integrating  out  the  other  variable  we  can  get  marginal  pdfs  on  phase  and  amplitude 

On  a 2, 

PAifl)  =  (2.25) 

E[ai\ 

Pe(D)  =  T  (2.26) 

Finally  we  can  hnd  the  pdf  on  intensity  by  noting  that  /  =  A2  and  doing  another  pdf 
transformation  which  yields 

Pi(i)  =  2^2 e~ ^  (2-27) 

This  is  a  negative  exponential  distribution  and  has  the  property  that  E[i]  =  2 a2  which 
allow  us  to  rewrite  the  pdf  as 


Pi(i) 


(2.28) 


The  above  argument  applies  to  the  pdf  of  instantaneous  intensity,  but  since  all  de¬ 
tectors  have  a  finite  integration  time  we  are  interested  in  the  statistics  of  integrated 
intensity.  The  reader  is  referred  to  chapter  6  of  reference  [19]  for  a  more  complete 
development.  The  resulting  pdf  of  integrated  intensity  is 


'  M 

Pw(W)  =  [  — 


M 


WM 


-1 


exp 


r  (M) 


(2.29) 


where  W  is  the  integrated  intensity,  W  is  the  expected  value  of  the  mean  intensity,  and 
A4  is  the  speckle  parameter  of  the  light.  By  recognizing  that  there  will  be  detection 
noise  we  can  transform  this  PDF  one  final  time.  The  detection  noise  is  poisson  and 
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the  final  pdf  for  the  number  of  photocounts  in  the  integration  time  is  given  below: 


P(K) 


r  (k  +  M) 
r(A^  +  i)r(Ai) 


„  M 

—  K 

K 

1  + 

1  +  — 

A 

M 

(2.30) 


where  T  is  the  well  known  T-function.  This  is  the  negative  binomial  distribution  and 
accurately  models  photocount  statistics  [19]. 


2.3  Coherent  Imaging 

This  section  will  develop  a  model  of  coherent  imaging  and  contrast  it  with 
incoherent  imaging.  An  understanding  of  linear  systems  will  be  needed  and  will 
therefore  be  provided  up  front. 

2.3.1  Linear  Systems  Review.  In  order  to  understand  linear  system  analysis 
we  must  first  define  what  we  mean  by  a  system.  A  system  is  defined  as  a  mapping  of  a 
set  of  input  functions  to  a  set  of  output  functions  [20] .  For  optical  imaging  problems 
the  set  of  input  and  output  functions  can  represent  either  real-valued  intensity  or 
complex- valued  field  amplitude;  in  either  case  the  functions  are  defined  in  2-D  variable 
space. 

A  convenient  way  to  represent  a  system  is  as  a  mathematical  operator,  5{}, 
which  will  operate  on  a  2-D  function  to  produce  an  output  that  is  also  a  2-D  function 

g2(x,y)  =  S{gi(£,r})}  (2.31) 

It  is  important  to  note  that  this  relation  can  be  many  to  one  or  one  to  one,  but  since 
for  now  we  will  limit  our  scope  to  deterministic  systems  it  can  not  be  a  one  to  many 
mapping. 

Now  that  we  have  defined  a  system,  we  must  proceed  to  define  the  more  re¬ 
strictive  case  of  linear  systems.  A  system  is  referred  to  as  linear  if  the  superposition 
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principle  is  obeyed  for  all  input  functions  p  and  q  and  all  complex  constants  a  and  b 


S{ap  +  bq}  =  aS{p}  +  bS{q}  (2.32) 

The  advantage  of  linearity  is  the  ability  to  express  the  output  of  a  system  in  terms  of 
a  sum  of  ’’decomposed”  inputs.  To  further  this  idea  we  can  look  at  the  response  of  a 
system  to  a  displaced  delta  function,  S(x i  —  £,  7/1  —  rj) 

h{x,y,Z,rj)  =  S{6(x-£,y-rl 7)}  (2.33) 

The  function  h  is  referred  to  as  the  impulse  response  of  the  system.  This  allows  us  to 
relate  the  input  and  output  by 

/oo  roo 

/  9i(Z,v)h(x,y,Z,r})d£dri  (2.34) 

-OO  J  —  OO 

From  here  we  will  further  restrict  our  interest  to  linear  shift-invariant  systems.  A 
shift-invariant  system  is  a  system  whose  impulse  response  is  only  dependent  on  the 
separation  of  points  in  space  and  not  the  points  themselves.  For  such  systems  we  can 
write  the  impulse  response  as 

h(x,  y\ £,  rj)  =  h(x  -  f ,  y  -  rj)  (2.35) 

Using  this  result  we  rewrite  the  Equation  2.36  as 

/OO  /*oo 

/  9i{Z,v)h{x-t,y~"n)d£dri  (2.36) 

00  J  —  00 

which  is  recognized  to  be  a  2-D  convolution  of  the  input  function  with  the  impulse 
response.  For  convenience  in  later  chapters  we  define  short  hand  notation  for  convo¬ 
lution  as 

g2{x,y)  =  \g1*h](x,y)  (2.37) 
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2.3.2  Imaging  -  A  linear  systems  approach.  Equation  2.7  can  be  rewritten 
as 

/OO  POO 

/  h(x  -£,y-  V )U0(€,  rj)d£dig  (2.38) 

-OO  J  — OO 

where  Ui  is  the  held  in  the  image  plane,  UQ  can  be  thought  of  as  a  geometric  prediction 
of  the  held  in  the  image  plane  and  h(u,  v )  =  jAe^u2+v‘2^ .  This  is  easily  recognized  to 
be  a  convolution.  By  writing  the  propagation  process  as  a  convolution  we  can  infer 
that  the  process  is  linear  in  complex  held  and  space  invariant  [16].  For  a  coherent 
system  the  instantaneous  intensity  in  the  receiver  plane  can  be  found  by 

/oo  poo  2 

/  h(x  -£,y-  rj)U0{£,  rj)d£dig  (2.39) 

-OO  J  —OO 

As  mentioned  before,  all  detectors  integrate  for  a  time  period  and  therefore  average 
resulting  in 

/OO  POO  POO  POO 

/  /  /  E[Uo(^i,iii)U*(^2,V2)}h(x-^i,y-Vi)h*(x-^2,y-V2)d^idi]1d^2di]2 

oo  J  —  oo  J  —  oo  J  —oo 

(2.40) 

where  the  expectation  is  taken  over  the  phase  of  UQ.  However  we  know  from  above 
that  for  a  single  speckle  realization  this  is  a  deterministic  value  and  the  expectation 
can  be  dropped. 

2.3.2. 1  Incoherent  Imaging  Systems.  It  is  necessary  to  take  a  brief 
look  at  incoherent  imaging  systems.  The  expectation  in  Equation  2.40  represents  the 
mutual  intensity.  We  have  used  the  time  average  and  ensemble  averages  interchange¬ 
ably  here  since  the  random  parameter  is  a  time  varying  phase  value.  For  incoherent 
light  this  mutual  intensity  is 

E[U0(^,  m)U:^2,V2)\  =  <$(£i  -  6,  Vi  -  V2)Io(Zi,V i)  (2.41) 
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Using  this  result  along  with  Equation  2.40  we  are  able  to  write  an  expression  relating 
the  the  object  and  image  intensities 

/OO  /*oo 

/  s(x  ■~£,y~ri)I0(Z,v)  (2-42) 

-OO  J  —  OO 

where  the  point  spread  function,  s,  is  given  by 

s(x,y)  =  \h(x,y)\2  (2.43) 

It  is  important  to  recall  that  I0  is  not  the  true  intensity  distribution  of  the  object, 
but  rather  a  geometric  image  of  the  object  in  the  image  plane. 

2. 3. 2. 2  Multiframe  Coherent  Imaging.  Incoherent  imaging  is  impor¬ 
tant  due  to  the  fact  that  as  many  independent  speckle  realizations  are  imaged  and 
summed  in  a  coherent  system  the  result  approaches  that  predicted  by  an  incoherent 
system  [21]. 

2-4  Imaging  through  turbulence 

The  above  propagation  theory  applies  to  light  propagating  in  a  medium  with 
uniform  index  of  refraction.  In  the  atmosphere  this  is  not  the  case.  The  consequence 
of  this  is  that  atmospheric  turbulence  becomes  the  limiting  factor  for  resolution  for 
most  optical  systems  that  require  a  long  propagation  through  the  atmosphere.  In 
long  exposure  imaging,  the  point  spread  function  (PSF)  of  the  imaging  system  is  very 
broad  and  smooth;  while  in  short  exposure  imaging,  known  as  speckle  imaging,  the 
PSF  is  not  quite  as  broad  but  suffers  from  a  modulated  (speckled)  irradiance  pattern. 
In  either  of  these  two  cases  angular  resolution  is  severely  limited  [31]. 

Turbulence  effects  result  from  random  spatial  and  temporal  fluctuations  in  in¬ 
dex  of  refraction  in  the  atmosphere,  which  in  turn  cause  a  random  variation  in  optical 
path  length  (OPL).  These  variations  in  OPL  result  in  phase  abberations  on  the  wave- 
front,  which  in  turn  become  intensity  variations  after  the  wave  has  propagated.  Since 
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atmospheric  turbulence  is  not  easily  modeled  as  a  deterministic  process,  statistical 
models  are  required  to  understand  and  model  these  effects.  The  first  of  these  models 
was  created  in  the  1940’s  by  A.N.  Kolmogorov. 

The  atmosphere  can  be  considered  a  viscous  fluid,  and  therefore  it  has  two 
distinct  states  of  motion  -  laminar  and  turbulent.  The  distinction  between  these  two 
states  is  that  laminar  flow  is  smooth  and  regular  while  turbulent  flow  is  unstable  and 
acquires  random  subflows  called  turbulent  eddies.  The  separation  between  these  two 
regimes  is  defined  by  the  Reynolds  number: 

Re  =  1^-  (2.44) 

r£v 

where  vavg  is  the  average  air  velocity,  /  is  the  scale  size,  and  kv  is  the  viscosity  of  the 
air.  When  the  Reynolds  number  exceeds  some  critical  value  the  flow  is  said  to  be 
turbulent.  As  an  example,  the  viscosity  of  air  is  kv  =  1.5  x  lO”5^-,  and  assuming  a 
scale  size  of  /  =  10m  and  a  velocity  of  vavg  =  1  — ,  a  Reynolds  number  of  6.7  x  105 
is  found.  This  example  demonstrates  that  atmospheric  air  flow  is  essentially  always 
turbulent  [31]. 

In  Kolmogorov’s  theory,  he  suggested  the  structure  of  the  atmosphere,  for  large 
Reynolds  numbers,  was  homogenous  and  isotropic  within  the  inertial  subrange.  Inside 
the  inertial  subrange  the  atmosphere  is  comprised  of  eddies  that  interact  and  exchange 
energy  to  form  and  divide  into  smaller  eddies.  An  eddy  is  defined  as  a  pocket  of  air 
that  has  a  uniform  temperature  and  pressure  [31].  The  inertial  subrange  is  defined 
by  eddy  sizes  bounded  by  the  inner  scale,  /o,  and  the  outer  scale,  L0. 

Index  of  refraction  variations  in  the  atmosphere  result  from  temperature  inho¬ 
mogeneities  caused  by  turbulent  air  motion.  Since  temperature  fluctuations  are  a 
function  of  location  in  space,  R,  and  time,  t,  so  is  the  index  of  refraction: 

n(R,  t )  =  no  +  ni(R,  t),  (2-45) 
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where  no  ~  1  is  the  mean  value  of  the  index  of  refraction  and  n i  (R,  t)  is  the  deviation 
about  this  mean.  The  index  of  refraction  time  dependence  can  be  ignored  since  the 
rate  of  change  of  the  atmosphere  is  slow  when  compared  to  the  typical  timescales  of 
turbulence  moving  across  the  beam  (Taylors  Frozen  Flow)  [1].  Using  these  simplifi¬ 
cations,  the  index  of  refraction  can  be  represented  by 

n(R)  =  1  +  ni(R),  (2.46) 


Using  Equation  2.46,  it  is  possible  to  arrive  at  the  structure  function  describing  the 
index  of  refraction  variations  in  the  atmosphere. 

Because  it  is  not  possible  to  exactly  describe  the  index  of  refraction  random 
process  for  all  positions  in  space,  the  structure  function  is  necessary.  There  are 
too  many  random  behaviors  and  variables  to  account  for  in  a  closed  form  solution. 
The  index  can  only  be  described  in  reference  to  stationary  random  functions.  Over 
long  spatial  periods,  the  index  of  refraction  is  not  a  stationary  random  process,  but 
over  short  spatial  periods  of  interest  to  applications  of  laser  propagation,  the  index  is 
considered  to  have  stationary  increments  [1].  In  other  words,  it  is  possible  to  treat  the 
index  random  process  as  stationary  with  emphasis  on  the  function  n(R+Ri)  —  n  (Ri). 
Intuitively,  the  structure  function  of  the  index  of  refraction  is  the  mean  squared 
difference  between  the  index  of  refraction  at  one  point  in  space  and  the  index  at  a 
point  with  some  separation  distance  from  the  first  point.  The  structure  function  of 
n(R)  is  defined  by: 

A,.(Ri,R2)  =  (HRi)  -  n(R2)]2)  (2.47) 

where  Ri  and  R2  are  vectors  describing  points  in  space  and  (•)  denotes  the  ensemble 
average.  By  starting  with  the  structure  function  of  wind  velocity  Kolmogorov  was 
able  to  determine  the  structure  function  of  the  index  of  refraction  to  be: 


C,2i?2/3  ,  l0  «  R  <  L, 

C'2/0“4/3i?2  ,  R  <  l0 


(2.48) 
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where  l0  and  L0  are  the  inner  and  outer  scale  sizes,  R  =  |R2  —  Ri],  and  C\  is  the 
atmospheric  structure  constant.  At  small  scale  sizes  below  lQ,  the  structure  func¬ 
tion  follows  a  squared  relationship  (second  part  of  Equation  2.48)  which  is  found  by 
performing  a  Taylor  Series  expansion  on  the  structure  function  for  small  separation 
distances  [1].  The  structure  function  is  dependent  on  the  separation  distance  R  and 
has  units  of  radians  squared;  it  can  be  written  in  terms  of  the  atmospheric  Fried 
parameter: 

Dn(R)  =  6.88  (  -j  ,  (2.49) 

where  r0  relates  to  turbulence  strength  and  is  defined  and  discussed  in  a  later  section. 

The  structure  function  is  related  to  the  autocorrelation  function  of  the  index  of 
refraction,  Tn,  by: 

Dn{R)  =  2\Tn(0)-Tn(R)]  (2.50) 

Further  the  autocorrelation  function,  when  it  exist,  is  related  to  the  power  spectral 
density  (PSD)  ,  <3>n(/c)  by  the  Wiener- Khinchin  theorem  : 


T„(R)  =  /  $n(K)ej*RdK 


(2.51) 


From  this  relation  a  spectral  model  for  the  atmosphere  can  be  developed. 

The  statistical  distribution  of  size  and  number  of  turbulent  eddies  is  described 
by  the  PSD  of  n,  <3>n(/£)  where  k,  is  the  spatial  wavenumber  vector.  The  PSD  can  be 
thought  of  as  a  measure  of  the  relative  abundances  of  turbulent  eddies  at  a  given  scale 
size.  Using  the  assumption  that  the  index  of  refraction  is  homogenous  and  isotropic, 
the  PSD  can  be  written  as  a  function  of  the  scalar  wavenumber,  n  [31].  Kolmogorov’s 
theory  only  predicts  a  form  for  the  PSD  inside  the  inertial  subrange: 


$n(K,z)  =  0.033 C2n(z)K^ 


(2.52) 
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where  C^(z)  is  the  structure  constant  of  the  atmosphere  as  a  function  of  location  in 
the  propagation  path,  z. 

The  Kolmogorov  spectrum  is  not  valid  for  all  wavenumbers  so  a  more  complete 
model  is  required.  For  a  more  complete  model  the  modified  atmospheric  spectrum  will 
be  used  and  is  given  by  [1]: 


$„(«,  z)  =  0.033C'2(^)  [l  +  1.802  («/«,)  -  0.254  (k/kz)7/6 


exp^  —  K,2  / K,^ 
(«2+«|)11/6 


0  <  K  <  OO 


(2.53) 

where  K(  —  V2  and  kq  =  The  atmospheric  model  can  be  tailored  by  selecting 

1*0  J-'o 

appropriate  inner  scale,  outer  scale,  and  C2  values  depending  on  the  laser  beam 
propagation  scenario. 


The  strength  of  turbulence  in  the  atmosphere,  C'2,  depends  on  height  above 
ground  and  the  model  chosen.  Total  turbulence  strength  for  the  entire  path  is  found 
by  integrating  C%(z)  over  the  path  that  laser  light  would  travel  to  the  sensor.  To 
accomplish  this  task,  the  Hufnagel- Valley  (H-V)  model  is  chosen  describing  C' 2 .  Like 
the  modified  spectrum  for  the  atmosphere,  the  H-V  model  is  most  commonly  used 
for  generic  conditions  describing  C2,  as  it  is  based  on  real  data  of  various  seasons, 
altitudes,  and  geographic  locations  [1].  The  H-V  model  used  is 


(72( h)  =  0.00594(u/27)2(10_5h)10  exp(— h/1000)  +  2.7  x  KT16  exp(-h/1500) 


+Hexp(— h/100) 


(2.54) 


where  h  is  the  height  above  the  ground,  v  is  the  root-mean-square  wind  speed  in 
(m/s)  and  A  is  the  value  of  C2(0)  at  the  ground  in  m-2//3. 


2-4-1  Atmospheric  Parameters.  Three  atmospheric  parameters  are  often 
used  to  describe  turbulence  strength:  the  Fried  parameter,  r0;  the  isoplanatic  angle, 
6q ;  and  the  Rytov  variance,  g\.  Each  of  these  parameters  is  a  different  moment  of  C2 
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and  is  defined  below: 


3/5 


r0  =  1.67 


dn  = 


1-3/5 


2.91  k2 


!5/3C.2.( 


z)dz 


(2.55) 

(2.56) 


u i  =  2.25 k7/6  [  Cl(z)z5/6dz  (2.57) 

Jo 

where  z  =  c(^0  and  6Z  is  the  zenith  angle.  The  Fried  parameter  defines  the  roll  off 
of  the  OTF  of  the  atmosphere  [31];  another  way  of  saying  this  is  that  little  is  gained 
in  resolution  for  aperture  sizes  larger  than  r0.  In  this  research  r0  is  the  parameter  we 
will  use  to  describe  turbulence  strength. 


2.5  Phase  Screens 

For  long  propagations  through  a  non-uniform  media  it  is  necessary  to  have  a  way 
of  modeling  the  phase  perturbations  as  discrete  layers  of  phase  that  can  be  added  to 
the  unperturbed  wave.  This  type  of  model  is  called  a  phase  screen.  Depending  on  the 
effects  to  be  modeled  and  the  level  of  accuracy  required  one  phase  screen  may  or  may 
not  be  sufficient.  If  more  than  one  phase  screen  is  used  the  strength  of  each  screen 
must  be  adjusted  accordingly.  If  r0  is  used  to  define  the  strength  of  the  atmosphere 
then  each  phase  screen  can  be  assigned  a  strength  according  to 

N 

’T/3  =  £  T«'3  (2.58) 

i=  1 

where  is  the  Fried  parameter  of  the  individual  phase  screens  [31,  72], 

2.6  Phase  Screen  Creation 

Modeling  the  atmosphere  using  knowledge  of  the  scenario  and  power  spectrum 
allows  phase  screens  to  be  produced  to  represent  the  atmospheric  random  process. 
Several  methods  exist  for  producing  phase  screens  using  the  power  spectrum.  Two 
common  methods  involve  using  the  Zernike  polynomial  basis  set  to  produce  phase 
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screens  and  using  the  inverse  Fourier  transform  of  the  power  spectrum  with  Gaussian 
random  variables  to  produce  phase  screens.  One  disadvantage  in  the  Fourier  transform 
method  is  that  for  simulations  over  longer  time  periods,  an  increasingly  large  screen 
must  be  computed.  Additionally,  the  Fourier  transform  method  produces  screens  that 
lack  low  frequency  accuracy.  In  other  words,  the  modified  spectrum  to  be  modeled 
contains  a  large  percentage  of  power  in  the  low  frequency  components.  In  taking 
the  inverse  Fourier  transform,  these  low  frequency  regions  are  not  allocated  enough 
samples,  so  low  frequencies  are  under-represented.  To  alleviate  these  two  problems,  a 
modification  to  the  Fourier  transform  method  is  used  called  the  “generalized  Fourier 
series  method  ”  [26]. 

To  facilitate  understanding  the  Fourier  series  method,  the  power  spectrum  must 
be  discussed  in  relation  to  random  processes.  Intuitively,  the  power  spectrum  of  a 
random  process  is  the  average  amount  of  power  in  each  frequency  component  com¬ 
posing  the  random  process.  For  this  case,  the  random  process  is  phase  variation 
induced  by  the  atmosphere.  The  power  spectrum  is  related  to  the  covariance  of  the 
phase  variation,  Bn  (R)  by  the  Wiener- Khintchine  theorem  (recall  Equation  2.51). 

Starting  with  the  modified  power  spectrum  representing  the  atmosphere  (Equa¬ 
tion  2.53),  one  can  finely  sample  the  PSD  the  low  frequency  region  and  then  give 
fewer  samples  to  the  high  frequency  region.  In  this  way,  the  frequency  regions  that 
have  a  larger  power  are  sampled  more  often.  The  PSD  is  then  randomized  using 
Gaussian  variables  with  the  appropriate  variance.  The  result  is  an  array  of  complex 
coefficients  describing  the  frequency  composition  of  a  phase  screen  iteration.  The 
complex  coefficients  exhibit  circular  complex  Gaussian  statistics  with  a  variance  cor¬ 
responding  to  the  previously  sampled  PSD.  All  that  remains  is  to  sum  sinusoids  of 
corresponding  frequencies  to  produce  the  desired  phase  screen.  Implementing  this 
procedure  is  accomplished  by  the  inverse  Fourier  series  given  by 

OO  OO 

M*,y)=  E  E  c n,mej2^nX+fy-v\  (2.59) 
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where  (x,  y )  are  spatial  coordinates  of  the  screen,  cn  m  are  randomized  complex  co¬ 
efficients  from  the  PSD  of  interest,  and  (fXn,  fym)  are  spatial  frequency  components 
from  the  PSD.  Note  that  the  sum  is  calculated  rather  than  using  a  Fast  Fourier 
Transform  (FFT);  this  is  necessary  due  to  the  nonlinear  sampling  method.  By  ran¬ 
domizing  the  real  part  of  the  PSD  using  a  circular  complex  Gaussian  random  variable, 
complex  coefficients  cn>m  are  created  containing  a  random  phase.  Therefore,  each 
phase  screen  iteration  (f>k(x,y)  is  unique  and  possesses  a  unique  random  phase  in  the 
Fourier  domain. 

An  advantage  in  this  method  appears  for  applications  requiring  a  sequence  of 
screens  to  represent  longer  time  periods  (several  seconds).  Instead  of  calculating 
one  large  phase  screen  and  moving  the  area  of  interest  around  the  screen  as  time 
progresses,  it  is  only  necessary  to  calculate  the  screen  exactly  where  it  is  needed. 
Although  the  generalized  Fourier  series  method  cannot  take  advantage  of  fast  Fourier 
transform  algorithms,  calculations  are  still  saved  by  only  calculating  the  screen  area 
of  interest.  To  implement  the  Fourier  series  method,  a  PSD  for  the  turbulence  of 
interest  is  calculated  using  l0,  L0,  r0  and  the  spatial  frequency  region  of  concern.  The 
Fourier  series  coefficients  are  then  calculated  for  frequency  components  of  interest 
(calculating  more  low  frequency  components).  Afterward,  the  coefficients  are  used 
to  construct  the  phase  screen  by  summing  sinusoids  with  different  weights  at  any 
location  desired. 
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III.  Estimation  Thoery 

Estimation  theory  is  a  well  defined  method  for  inferring  values  of  unknown  parameters 
based  on  available  observations  which  are  corrupted  by  noise  or  incomplete.  The 
estimate  is  formulated  based  on  a  cost  function  that  weights  the  penalty  for  incorrect 
guesses.  The  basic  estimation  problem  has  four  main  components:  a  parameter  space, 
a  probabilistic  mapping,  an  observation  space  and  an  estimation  rule  (see  Figure  3.1). 

The  parameter  space  is  made  up  of  all  possible  parameters  values,  including  the 
set  that  corresponds  to  reality.  The  parameter  values  generated  by  the  estimation 
rule  are  also  contained  in  the  parameter  space  and  hopefully  he  ” close”  to  the  ’’true” 
data.  The  probabilistic  mapping  is  a  statistical  model  of  the  process  (or  processes) 
that  describes  how  uncertainty  is  incorporated  in  the  model  for  the  data.  The  map¬ 
ping  is  posed  in  the  form  of  probability  distribution  functions  of  the  data  given  the 
observations.  The  observation  space  is  composed  of  all  possible  observations  and  may 
or  may  not  intersect  the  parameter  space.  Finally,  the  estimation  rule  is  the  mapping 
used  to  map  elements  of  the  observation  space  to  elements  of  the  parameter  space. 

The  parameter  and  observation  spaces  as  well  as  the  probabilistic  mapping  are 
all  determined  by  the  problem  to  be  solved  and  cannot  be  changed.  The  task  for  the 
designer  is  to  develop  an  estimation  rule  that  gives  acceptable  results.  The  theory  on 
estimation  in  this  chapter  is  taken  from  [29, 34] 

3.1  Bayes  Estimation 

If  the  parameter  to  be  estimated,  a,  is  a  random  variable  one  could  choose 
Bayesian  estimation  as  a  means  to  develop  an  estimation  rule.  The  first  step  in  this 
estimation  procedure  it  to  define  a  cost  function.  The  cost  function  will  be  used  to 
weight  estimates,  a.  The  cost  function,  C(a,a(R))  (R  is  a  vector  of  observations),  is 
a  function  of  the  true  data  and  the  estimate;  however,  it  is  usually  sufficient  to  write 
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Probablistic 


Figure  3.1:  Diagram  showing  relationships  between  parameter  and  observation 
spaces  and  mapping/decision  rules 

the  cost  as  a  function  only  of  the  error  of  the  estimate,  ae 


ae  =  a(R)  —  a  (3.1) 

This  cost  function,  C(ae),  is  a  function  of  a  single  variable  and  is  more  convenient  to 
work  with. 

Three  common  cost  functions  are  the  squared  error,  the  absolute  error,  and  the 
uniform  cost  function.  The  squared  error  cost  function, 

C(ae)  =  a2e,  (3.2) 


clearly  emphasizes  large  errors.  The  absolute  error  cost  function, 


C(a€)  =  |ae| , 


(3.3) 
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is  directly  proportional  to  the  magnitude  of  the  error.  Finally  the  uniform  cost  func¬ 


tion, 


C(a.)  = 


\ 0 

ae 

ae 

<t 

>f 


(3.4) 


gives  no  weight  to  the  magnitude  of  the  error,  but  rather  gives  a  uniform  penalty  if 
the  error  surpasses  a  threshold,  A.  The  cost  function  is  chosen  by  the  designer  based 
on  the  problem;  for  example  a  tracking  system  might  chose  the  squared  error  cost 
function  to  penalize  extreme  errors,  while  a  targeting  system  would  choose  a  uniform 
cost  function  since  anything  outside  of  a  certain  error  is  unacceptable. 


When  using  Bayesian  estimation,  it  is  necessary  to  have  a  known  or  estimated 
a  priori  probability  distribution  (prior)  for  the  variable  to  be  estimated;  later  section 
will  discuss  how  to  deal  with  estimation  problems  when  the  prior  is  unknown.  Once 
we  have  selected  a  cost  function,  an  expression  for  the  risk  is  formed 


/OO  POO 

/  C[a,a((R)]pA,R.(a,r)  (3.5) 

-oo  J  —  OO 

The  Bayes  estimate  is  simply  the  estimate  that  minimizes  the  risk  function.  The  dif¬ 
ficulty  is  that  the  joint  probability,  pA,R.(a,r),  is  normally  not  available.  To  overcome 
this  we  recognize  that  the  joint  probability  can  be  rewritten  as 


Pa,  r(o,  r)  =  pR(r)pj4|R(a|r) 


(3.6) 


This  is  further  complicated  by  the  fact  that  PA|R.(o|r)  is  also  normally  not  defined, 
however  it  can  be  found  via  Bayes  rule 


PA|R.(a|r) 


PR\A(r\a)pA(a ) 
Pn(r) 


(3.7) 


We  will  see  later  it  is  not  necessary  to  have  pR. 
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This  leads  to  the  important  question  of  how  to  handle  problems  where  the  prior 
of  the  parameter  to  be  estimated,  Pa{o>),  is  unknown.  The  next  section  on  maximum 
likelihood  estimation  addresses  how  to  approach  this  problems. 


3.2  Maximum  Likelihood  Estimation 

Recall  from  above  that  R  is  a  vector  of  observations  and  A  is  the  true  value 
we  are  attempting  to  estimate.  If  we  have  no  knowledge  of  the  distribution  function 
of  A,  the  method  in  the  previous  section  cannot  be  used.  Instead  we  choose  as  our 
estimate  the  value  of  A  that  most  likely  led  to  the  observed  values  of  R.  The  first 
step  in  this  process  is  to  define  the  log-likelihood  function 


C(a)  =  In  [pR|x(r|a) 


(3.8) 


The  maximum  likelihood  estimate  is  the  value  of  a  where  this  function  is  maximized. 
If  the  maximum  exist  in  the  range  of  a,  and  C(a)  has  a  continuous  first  derivative, 
then  ami  is  found  by 


dC(a) 


da 


=  0  (3.9) 

I  a=arni 

Often  this  derivative  is  difficult  to  solve  in  closed  form  and  we  must  resort  to  iterative 
methods;  one  such  method  is  the  Richardson- Lucy  Algorithm  and  it  is  discussed  in 
the  next  section. 


3.3  Richardson- Lucy  Algorithm 

The  Richardson- Lucy  [30]  algorithm  is  a  modified  gradient  ascent  method  used 
for  finding  the  maximum  of  Poisson  likelihood  functions,  where  dC^  is  the  gradient  of 
the  likelihood  function  C(a).  A  typical  gradient  ascent  algorithm  begins  by  assuming 
C{a)  is  maximum  at  amax.  Next  an  arbitrary  starting  point,  a 0,  is  chosen  and  the 
value  of  \a=ao  is  found;  if  this  value  is  positive  then  amax  >  ao,  if  it  is  negative 
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then  amax  <  a0.  We  now  calculate  a i  according  to 


an  —  an- 1  +  ft 


dC{a) 

da 


I  a=an  —  1 


(3.10) 


This  algorithm  iterates  until  1  |0=a  «  0.  The  positive  constant,  n,  must  be 

sufficiently  small  to  allow  for  convergence. 

The  Richardson- Lucy  algorithm  performs  gradient  ascent  by  breaking  dC^  into 
positive  and  negative  parts 


dC(a) 

da 


dC(a)+  dC(a) 

da  a=an  +  da  a=an 


(3.11) 


It  should  be  clear  that  if  dCi",]+  I  >  dC^)  I  then  I  is  positive  and 
therefore  amax  >  an,  conversely  if  |a=an  <  ^ -  \a=an  then  ^  |a=an  is  nega¬ 

tive  and  amax  <  an.  The  R.-L  algorithm  forms  a  ratio  of  dCi'”i  1  |a  :  dC[i'l'u  1  \a=CLn  and 
updates  an  according  to 


an 


dC(a)+  | 

da  I  a=an-i 

Qn-1  d£(q)-~j 

da  \a=an—i 


(3.12) 


3-4  Estimator  Quality 

Once  we  have  established  an  estimation  routine,  we  would  like  to  determine  the 
quality  of  our  estimator.  To  do  this  we  would  attempt  to  find  the  bias  and  the  variance 
of  the  estimator.  The  bias  of  our  estimator  is  defined  by  the  following  equation 


B{a)  =  E  [a(R)  —  a 


(3.13) 


If  the  bias  is  zero  we  say  we  have  an  unbiased  estimator,  if  it  is  non-zero  and  not  a 
function  of  a  we  have  a  known  bias,  if  it  is  a  function  of  a  we  have  an  unknown  bias. 

For  an  iterative  Richardson-Lucy  (R-L)  algorithm,  the  bias  cannot  be  directly 
calculated.  To  find  the  bias  for  this  estimator,  we  take  advantage  of  the  fact  that  for 
Poisson  statistics  the  R-L  algorithm  converges  to  the  maximum  likelihood  solution 


[35].  If  we  can  show  that  a  maximum  likelihood  estimate  is  unbiased,  the  R-L  estimate 
will  also  be  unbiased. 

Even  an  unbiased  estimator  can  give  unacceptable  results  for  a  particular  trial; 
the  estimator  may  indeed  have  a  pdf  centered  on  the  true  value,  but  a  large  variance. 
It  is  desirable  for  an  estimator  to  have  a  small  variance  in  the  estimated  values.  It 
is  often  difficult  to  calculate  the  variance  of  an  estimator;  it  is  usually  much  easier 
to  calculate  a  lower  bound  on  the  estimator  variance  and  then  compare  the  actual 
performance  of  the  algorithm  to  this  lower  bound.  One  lower  bound  for  unbiased 
estimators  is  the  Cramer-Rao  bound  defined  by 


Var  fo(R)  —  a]  >  E 


dhipR^Crla) 

da 


“I  2 


-1 


(3.14) 


or  equivalently 


Var  [a(R)  —  a]  > 


-E 


d2  In  pR| 


a  r  a 


da 2 


-i 


(3.15) 


where  the  Erst  and  second  partial  derivatives  are  assumed  to  exist  and  be  absolutely 
integrable.  Any  estimate  that  satisfies  this  bound  is  called  an  efficient  estimate.  The 
inverse  of  the  bound  is  referred  to  as  the  Fisher  information. 


3.5  Multiple  Parameter  Estimation 

All  of  the  above  techniques  can  be  applied  to  multiple  parameter  estimation 
problems  by  simply  forming  a  vector  of  parameters  to  be  estimated.  This  section  is 
dedicated  to  demonstrating  this  and  to  defining  a  few  operators  to  simplify  notation. 

Let  a  =  [a i  a2  a3  ....  a*]T  be  a  vector  of  parameters  that  we  wish  to  estimate. 
From  this  we  can  write  a  joint  log- likelihood  function 

£(a)  =  In  [pR|A(r|a)]  (3.16) 
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The  maximum  likelihood  estimate  is  found  by 


v„  |£(a)]  |a=4 


(3.17) 


where 


V„  = 


d  d  d  d 
da  i  da<2  da3  dat 


3.5.1  Estimator  Quality. 


(3.18) 


Bias.  Since  we  have  an  estimate  vector,  we  will  also  have  a  bias 
vector.  The  bias  vector  is  defined  as 


B(a)  =  £[ae(R)]  (3.19) 

where 

ae(R)  =  a(R)  —  a 

and  the  expected  value  of  the  vector  is  defined  by 

E[ a]  =  [E[ai]  E[a2]  E[a3]  ...E[aN}]T 
We  call  an  estimate  unbiased  if  each  component  of  the  bias  vector  is  zero. 

Covariance  Matrix.  For  the  single  parameter  case  we  define  the 
spread  of  the  error  by  the  variance  of  the  estimate;  for  the  multiple  parameter  case 
the  analogous  quantity  is  the  covariance  matrix 

Ae  =  E[( ae  -  B(a))(ae  -  B(a))T]  (3.22) 

Much  like  in  the  single  parameter  case,  it  is  not  always  practical  to  try  and  calculate 
the  values  in  Ae.  Instead  we  will  calculate  a  lower  bound.  First  we  find  the  Fisher 


(3.20) 


(3.21) 
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information  matrix,  J,  whose  elements  are  found  by 


Jij  =  E 


d  In  [pR|A(r|a)]  d  In  [pR|A(r|a)_ 


8  CL; 


da. 


(3.23) 


which  can  also  be  written  as 


Jij  =  ~E 


d2  In  [pR|A(r|a)_ 


dajdaj 


(3.24) 


Now  let  K  =  J  1 .  The  lower  bounds  on  the  variance  of  a,  are 


1 1  J  b  /  / 


(3.25) 
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IV.  A  Review  of  Deconvolution 


4-1  Introduction 

The  purpose  of  this  chapter  is  to  report  the  results  of  work  done  by  MacDonald 
[23];  the  work  is  foundational  to  this  research  and  is  therefore  worth  including.  Much 


of  this  chapter  is  adapted  from  his  dissertation.  This  chapter  will  also  demonstrate  a 


modified  stopping  method  for  the  algorithm  that  we  believe  to  be  more  accurate  and 
reliable. 

4-2  Problem  description  and  geometry 

As  stated  previously,  deconvolution  is  one  method  for  recovering  images  using 
image  plane  data;  a  simple  imaging  system  is  shown  in  Figure  4.1.  Most  deconvolu¬ 
tion  methods  are  applicable  to  incoherent  imaging,  however  this  research  deals  with 
imaging  coherently  illuminated  objects.  A  coherent  speckled  image  cannot  be  di¬ 
rectly  processed  using  traditional  deconvolution  techniques.  To  use  deconvolution  an 
incoherent  image  must  be  formed  (or  approximated)  from  many  frames  of  coherent 
imagery  [25]. 

4-2.1  Geometry.  The  coordinate  system  for  the  deconvolution  problem  will 
use  (x,  y )  to  describe  positions  in  the  object  plane  and  (u,  v)  to  describe  position  in 
the  image  plane.  To  simplify  notation,  we  will  define  the  following  variable  in  R2  to 
represent  the  ordered  pairs 


X  =  (x,y) 


U  =  (u,  v ) 


Image  Plane 


Figure  4.1:  Illustration  of  a  simple  imaging  system 
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4-2.2  Data  and  statistical  models.  In  order  to  derive  an  estimator  one  must 
have  information  on  the  statistics  of  the  data,  as  well  as  understand  what  the  ideal 
image  that  would  be  formed  in  the  absence  of  noise. 

4-2.2. 1  Data  Model.  The  model  for  the  image  plane  irradiance  takes 
advantage  of  the  fact  that  the  multi-frame  average  of  laser  speckle  images  converges  to 
the  incoherent  model  [24] .  This  is  due  to  the  assumption  that  the  phase  at  the  target 
is  random  and  independent  from  observation  to  observation  in  a  manner  consistent 
with  the  time  varying  phase  distribution  produced  by  incoherent  light.  By  capitalizing 
on  this,  the  image  plane  model  becomes  a  convolution  of  the  geometric  image  of  the 
source  irradiance  and  a  Point  Spread  Function  (PSF).  The  PSF  would  include  the 
effects  of  the  optical  system  and  the  path  turbulence.  This  convolution  is  written 
discretely  to  facilitate  the  derivation  of  a  computer  algorithm  for  image  recovery. 

i(U)  =  [o  *  Si](U)  =  o(X)Si(U  -  X )  (4.1) 

x 

where  o( X)  is  the  geometric  image  of  the  source  intensity  and  Si(U)  is  the  image  plane 
PSF.  Since  in  cases  of  interest  to  this  research,  the  aperture  is  large  with  respect  to 
the  Fried  parameter  of  the  turbulence,  the  PSF  is  dominated  by  atmospheric  effects. 
The  long  exposure  PSF  must  be  used  since  the  Signal  to  Noise  Ratio  (SNR)  of  fully 
developed  speckled  frames  is  always  equal  to  1;  making  tip-tilt  removal  problematic. 

To  complete  the  data  model,  we  must  look  at  the  dominate  noise  sources  in  the 
system,  and  form  an  appropriate  statistical  model  to  describe  the  corrupted  data. 

4 .2. 2. 2  Statistical  Model.  To  facilitate  forming  a  statistical  model  we 
will  define  a  single  frame  of  image  plane  data  as 

4“\u)  =  HU)  +  nfl,(U)  +  n  (4,2) 
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where  i  is  the  ideal  incoherent  image,  ushot  is  noise  associated  with  random  photon 
arrivals,  and  n^eckle  is  a  noise  term  that  captures  the  effects  of  speckle  and  any  other 
deviations  from  the  model  resulting  from  the  coherent  illumination.  Both  speckle 
and  shot  noise  are  assumed  independent  from  pixel  to  pixel;  this  assumption  allows 
for  computational  efficiency.  The  assumption  for  shot  noise  is  common  and  will  not 
be  rigorously  justified;  however,  for  the  case  of  speckle  noise  we  will  investigate  the 
validity  of  this  assumption  in  Appendix  C.  If  we  sum  Ad  frames  of  speckle  image 
data,  c/(: ,  we  have  a  new  random  variable,  dj.  We  are  interested  in  describing  the 
statistics  of  dj.  When  Ad  =  1  the  dominate  noise  is  nspeckie  and  d*  is  exponential 
distributed.  As  Ad  — >  oo  then  nshot  becomes  the  dominate  noise  source  and  d*  is 
Poisson  distributed.  It  would  be  desirable  to  have  a  single  distribution  that  was  valid 
for  all  Ad.  The  distribution  that  best  captures  all  of  these  effects  is  the  negative 
binomial  distribution  [19] 


p(K) 


T{K  +  Ad) 
T(K  +  l)T(Ad) 


i  +  il 

K 


-I<r  x 

1+M 


-M 


(4.3) 


where  K  is  a  random  variable  representing  photocounts  and  K  is  the  mean  number 
of  photocounts. 


4-3  Derivation 

Start  with  the  likelihood  function  based  on  the  negative  binomial  distribution  [23] 

C{o)  =  Y  di(U )  ln[i([/)]  -  [di(U)  +  Ad]  In \i(U)  +  Ad]  (4.4) 

u 

Since  we  are  summing  multiple  frames  of  coherent  data,  the  likelihood  function  of 
partially  coherent  data  is  used,  where  Ad  is  the  number  of  frames.  The  model  for 
i(U)  is  the  incoherent  model,  which  is  easily  described  as  a  convolution 

i(U)  =  Y°(x)si(U^X)  (4.5) 
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Using  this  definition  of  i(U)  we  can  write  C{6)  to  show  the  dependance  on  o 


(U)ln[J2o(X)h(U-X)]-[di(U)+M\ln[J2o(X)h(U^X)+M\  (4.6) 


u 


X 


X 


Recall  that  a  maximum  likelihood  estimate  is  defined  by 


dC(o) 

do 


I  o=6  =  0 


(4.7) 


Solving  for  the  required  derivative  we  arrive  at 


dC{o) 

do 


zmh{u-x)-*jg±£w-x) 


u 


m 


i(U)  +  M 


(4.8) 


Next  we  implement  a  Richardson-Lucy  type  algorithm  to  solve  for  o 


Err  *££h(U  -  X ) 

°new(X)  =  0oid( X  )—  di(u)+M  — 

i(U)+M 


(4.9) 


Implementation  is  simplified  by  recognizing  that  this  can  be  written  in  terms  of  cor¬ 
relations 


Onew(.X')  ^0/6  (  N  ) 


\dT*h}(X) 

i*£&*h]{X) 

where  -k  is  used  to  represent  the  correlation  operation. 


(4.10) 


4-4  Stopping  the  algorithm 

In  order  for  this  algorithm  to  be  useful,  we  have  to  be  able  to  stop  it  at  or  near 
the  optimal  iteration.  MacDonald  demonstrated  one  technique  [23],  but  it  relied  on 
some  assumptions  that  we  would  like  to  remove.  Any  stopping  criteria  should  be 
based  on  the  statistics  of  the  data,  and  not  require  a  priori  knowledge  of  the  object. 
Since  we  cannot  reliably  use  estimation  to  remove  the  noise,  the  algorithm  should  be 
stopped  when  the  variance  of  the  data,  dt,  around  the  estimated  mean,  i,  are  close  to 
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the  variance  of  the  data  around  its  sample  mean 


A-1  £(d? 


i)2  <  aa2 


(4.11) 


where  i  is  the  image  estimate,  o2  is  the  variance  of  the  data  around  the  sample  mean, 
and  a  is  a  parameter  that  lets  the  user  choose  the  degree  of  damping.  This  parameter 
will  be  discussed  later.  When  the  above  condition  is  satisfied  for  a  given  pixel,  the 
iterations  for  that  pixel  are  terminated.  In  order  to  stop  each  pixel  independently  we 
add  a  binary  mask ,771* ,  to  our  algorithm 


®  Oold  r  r 


(1  —  rrii +  m 


k  Si 


(1 


mi)  ijw +m* 


-k  S  i 


(4.12) 


When  the  algorithm  begins  m*,  is  a  matrix  of  zeros;  the  algorithm  is  then  checked  at 
each  iteration  to  find  the  pixels  that  satisfy  Equation  4.11.  When  a  pixel  its  found 
that  satisfies  the  damping  criteria  m,  is  set  to  one  for  that  pixel.  This  effectively  stops 
the  iteration  at  that  pixel  by  forcing  djfUo)  =  i{Uo). 

The  damping  parameter,  a,  gives  the  user  flexibility  to  choose  the  level  of  damp¬ 
ing.  When  a  is  chosen  too  low,  it  is  possible  to  over-iterate  and  amplify  noise;  if  a 
is  chosen  too  high  the  iteration  will  slow  considerably.  In  order  to  provide  some 
guidance  on  choosing  a,  we  turn  to  the  strong  law  of  large  numbers  [6]  which  states 
that  for  independent,  identically  distributed  random  variables  the  sample  mean  will 
approach  the  true  mean,  with  probability  1,  as  the  number  of  sample  increases.  For 
our  purposes  this  means  a  is  inversely  proportional  to  the  number  of  frames;  this  re¬ 
lationship  is  assumed  to  be  linear  for  simplicity.  The  relationship  is  not  truly  linear, 
but  it  is  believed  a  simple  relationship  will  work  over  the  range  of  frame  numbers 
for  this  research.  Using  this  as  a  starting  point,  an  attempt  was  made  to  empirically 
determine  a  function  to  give  an  ” ideal”  value  for  a. 


ol  —  1  -t- 


a 

M 


(4.13) 
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Figure  4.2:  Satellite  image  used  as  the  truth  data  in  simulated  data  sets 


where  a  will  be  optimized  experimentally. 


4-5  Data  Simulation 

The  simulated  data  sets  were  created  using  Matlab®  .  Each  set  consists  of  200 
independently  created  frames  of  both  pupil  plane  and  image  plane  data.  In  order 
to  create  the  frames  of  data  a  held  magnitude  was  defined  in  the  object  plane;  the 
magnitude  in  the  object  plane  is  shown  in  Figure  4.2.  A  random  phase  (uniform  on 
(—mi,  mi])  was  added  to  every  point  in  the  held  to  model  the  surface  roughness.  Each 
phase  is  constructed  to  be  independent  of  every  other  phase  in  the  held.  The  target 
was  sampled  at  the  Nyquist  rate  required  by  the  pupil  plane  aperture.  The  sample 
rate,  A,  is  calculated  according  to 


A  = 


2D 


(4.14) 


where  A  is  the  optical  wavelength,  z  is  the  propagation  distance,  and  D  is  the  diam¬ 
eter  of  the  aperture.  This  held  is  then  propagated  to  the  pupil  plane  using  a  Fourier 
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transform.  It  can  be  shown  that  dne  to  the  random  phase  in  the  target  held  any 
propagation,  regardless  of  propagation  length,  can  be  modeled  as  a  Fraunhoffer  prop¬ 
agation  and  therefore  a  simple  Fourier  transform.  The  imaging  aperture  function  and 
the  turbulence  phase  screen  are  then  applied  to  the  pupil  plane  held  and  an  inverse 
Fourier  transform  was  performed  to  give  us  the  image  plane  held;  the  magnitude 
squared  gives  us  the  image  plane  intensity.  In  equation  form  this  would  be: 

i(U)  =  \p~l  {J7  {U (X)}  Pexp{j9}}\2  (4.15) 

where  P  is  the  aperture  function  and  6  is  the  phase  screen. 

To  account  for  detection  noise  processes,  both  Gaussian-distributed  read  noise 
and  Poisson  distributed  shot  noise  were  applied  to  the  data.  Gaussian  detector  noise 
was  characterized  by  the  standard  deviation  ad.  The  average  detection  SNR  was  then 
defined  by 

SNR  =  .  P  (4.16) 

Vp  +  tf 

where  p  is  the  average  number  of  photo-electrons  per  pixel  in  the  data  set.  To  gen¬ 
erate  data  with  a  desired  detection  SNR,  Equation  4.16  was  inverted  and  the  data 
scaled  to  the  required  average  p.  The  scaled  data  was  then  used  to  generate  Poisson- 
distributed  random  numbers  with  mean  p.  Uncorrelated  Gaussian  random  numbers 
with  a  standard  deviation  of  ad  were  then  added  to  each  pixel.  SNR  was  defined  in 
this  manner  to  be  consistent  with  earlier  work  [22], 

4-6  Results  and  Conclusions 

With  the  deconvolution  algorithm  derived  and  implemented,  the  next  step  is  to 
characterize  its  performance.  The  approach  taken  for  performance  characterization 
was  to  use  a  Monte-Carlo  simulation  consisting  of  100  data  sets  of  200  frames  each 
at  four  different  turbulence  levels.  It  is  often  difficult  to  chose  metrics  to  quantify  the 
quality  of  a  recovered  image,  since  for  almost  any  metric  an  object  estimate  can  be 
chosen  that  is  much  better  or  worse  than  the  chosen  metric  implies.  Initially  mean 
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absolute  error  (MAE)  was  chosen  as  a  metric.  MAE  is  defined  as 


N,M 

MAE  =  (NM)-1  \°~°\  (4.17) 

n=l,m=l 


This  metric  was  quickly  discarded  as  the  errors  were  excessively  large  for  estimates 
that  were  shifted  with  respect  to  the  truth  object.  To  avoid  this  problem  we  choose 
a  slightly  more  complicated  metric  that  is  translation  invariant 


Er  =  min 


E  I o(u 


U0,VQ 


Uq,V  -  Vp)  -  o(u,v) \2 

E|o(«,u)  |2 


(4.18) 


which  is  more  easily  calculated  as 


p  _  f  66 (0,  0)  +  r oo(0, 0)  -  2  maxU0i„0  Re{rod(u0 ,  u0)}  ,  , 

T oo(0,  0)  ’ 

where  rOQ  and  Tod  are  autocorrelations  of  the  object  and  the  estimate,  and  rod  is  the 
cross  correlation.  In  general  correlation  function  are  defined  as 


rab(u,v )  =  EE  a(x,y)b(u  +  a,v  +  b)  (4.20) 

x  y 

For  a  more  complete  explanation  see  [9].  An  estimator  will  seek  to  minimize  this 
metric,  Er. 

The  first  step  in  characterization  was  to  find  an  optimal  value  for  the  damping 
parameter;  this  was  done  using  a  single  turbulence  value  and  then  assumed  to  be 
correct  at  any  turbulence  level;  a  —  15  was  chosen.  The  damping  parameter  was 
chosen  by  running  the  simulation  at  various  values  of  a  and  choosing  the  best  value. 
Once  the  damping  parameter  was  determined,  the  image  recovery  algorithm  was  run 
on  all  the  data  sets  and  the  results  are  presented  below. 


4-6.1  Baseline  results.  In  order  to  determine  the  effects  that  certain  system 
parameters  (frames,  turbulence)  have  on  system  performance  we  need  to  define  and 
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Figure  4.3:  Sum  of  200  frames  of  raw  image  plane  data 

characterize  a  baseline  case.  For  this  research  the  baseline  case  is  a  200  frame  data 
set  generated  in  turbulence  described  by  ^  =  10.  The  sum  of  one  set  (200  frames) 
of  raw  image  plane  data  is  shown  in  Figure  4.3.  A  sample  image  recovered  from  this 
scenario  can  be  seen  in  Figure  4.4  Clearly  the  recovered  image  is  improved  over  the 
raw  data,  however  no  high  spatial  frequency  details  have  been  recovered.  We  have 
a  better  idea  of  the  shape  and  extent  of  the  object,  but  little  information  about  the 
fine  structure.  To  quantify  the  improvement,  the  error  function  (Er)  was  calculated 
at  each  iteration.  This  was  done  for  100  data  sets  and  Er  was  plotted  with  error  bars 
representing  +/-  one  standard  deviation.  The  results  are  shown  in  Figure  4.5 

4-6.2  Effects  of  varying  the  number  of  data  frames.  Recalling  from  the  ear¬ 
lier  development  that  the  model  for  the  data  was  based  on  the  incoherent  convolution 
model,  it  should  stand  to  reason  that  as  the  number  of  frames  increases  the  data  will 
more  closely  match  the  model  and  therefore  the  recovered  images  should  be  better. 
The  effects  of  the  number  of  frames  will  be  quantified  by  running  the  deconvolution 
algorithm  on  data  sets  of  50  and  100  frames  and  comparing  the  recoveries  to  the 
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Figure  4.4:  Sample  image  recovered  using  3500  iterations  of  deconvolution 
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Figure  4.5:  Er  versus  iterations  for  baseline  simulation.  The  mean  of  100  recoveries 
was  plotted  with  error  bars  representing  +/-  one  standard  deviation 


200  frame  reconstructions.  Figure  4.6  shows  images  recovered  using  50,  100  and  200 
frames  of  data;  while  the  differences  are  not  dramatic,  clearly  the  quality  of  the  image 
improves  with  additional  frames.  To  really  understand  the  benefit  of  increasing  the 
number  of  data  frames  we  look  at  the  variance  of  Er  for  the  object  estimates.  As  seen 
in  Figure  4.7,  the  variance  increases  dramatically  as  the  number  of  frames  decrease. 
What  this  means  from  a  practical  viewpoint  is  that  as  the  number  of  frames  increases 
so  does  the  reliability  of  the  algorithm. 

4-6.3  Effects  of  turbulence  strength  on  the  recovered  image.  While  it  should 
be  obvious  to  the  informed  reader  that  the  strength  of  the  atmospheric  turbulence  will 
have  a  direct  impact  on  the  quality  of  the  recovered  image,  it  will  still  be  quantified 
here.  It  is  shown  to  illustrate  that  the  joint  algorithm  (see  Chapter  6)  will  be  less 
affected  by  turbulence.  Figure  7.2  shows  Er  for  the  recovered  image  at  each  iteration 
for  4  different  turbulence  strengths.  As  expected  turbulence  has  a  large  impact  on 
the  quality  of  the  reconstruction. 

4-6-4  Ability  to  recover  varied  intensity  images.  This  section  seeks  to  quan¬ 
tify  the  ability  of  the  deconvolution  algorithm  to  correctly  recover  the  various  inten¬ 
sities  in  a  single  image  accurately.  A  truth  image  consisting  of  five  squares  of  varying 
intensity  is  shown  in  Figure  4.9;  the  center  square  has  an  intensity  of  1,  while  the 
other  four  have  intensities  of  0.1,  0.25,  0.5,  and  0.75  (starting  in  the  upper  left  and 
moving  clockwise).  In  the  recovered  image  it  is  desired  for  the  ratios  of  intensity  in 
each  corner  to  the  intensity  of  the  center  square  to  be  accurately  recovered.  The  re¬ 
covered  image  can  be  seen  in  Figure  4.10.  The  mean  intensities  in  each  corner  square 
(as  a  percentage  of  the  mean  intensity  in  the  center  square)  in  the  recovered  image  are 
9.78%,  23.7%,  48.8%  and  73%.  Clearly  the  algorithm  can  recover  the  proper  intensity 
values  in  gray  scale  images.  It  is  further  noted  that  greater  than  90%  of  the  energy 
in  the  held  of  view  is  in  the  image. 
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Figure  4.6: 


(a)  50  Frames 


(b)  100  Frames 


(c)  200  Frames 

Images  recovered  using  different  numbers  of  data  frames. 
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Figure  4.7: 


Figure  4.8: 


Variance  of  the  estimate  for  varying  numbers  of  frames  of  data 


Er  of  recovered  images  shown  for  varying  turbulence  levels 
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Figure  4.9:  Truth  image  used  for  quantifying  intensity  recovery  accuracy 


Figure  4.10:  Recovered  gray  scale  image. 
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4-6.5  Quantification  of  resolution  of  the  algorithm.  To  define  the  resolution 
of  the  algorithm,  various  bar  targets  are  imaging.  The  system  resolution  is  defined 
as  the  pixel  spacing  between  bars  in  the  targets  when  the  algorithm  can  just  resolve 
them;  the  bars  will  be  said  to  be  resolved  when  the  intensity  of  the  valleys  fall  to 
less  than  half  that  of  the  peaks,  and  any  additional  bars  in  the  recovered  image 
are  suppressed  by  at  least  a  factor  of  5.  The  deconvolution  algorithm  was  able  to 
resolve  the  bars  when  they  were  separated  by  10  pixels.  The  results  are  shown  in 
Figurea  s4. 11-4. 12.  One  could  argue  that  the  human  eye  can  separate  the  bars  in 
the  raw  data,  however  the  criteria  for  resolution  is  the  valleys  must  have  less  than 
half  the  photons  of  the  peaks.  This  is  clearly  not  the  case  for  the  raw  data,  but  is 
satisfied  by  the  deconvolved  case.  The  slices  that  are  plotted  in  Figures  4.11b)  and 
(4.12b  are  the  mean  of  20  vertical  slices  from  the  center  of  the  image  data. 
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Figure  4.11: 


(a) 


(b) 


Raw  image  data  of  a  resolution  target.  10  pixels  separate  the  bars 
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Figure  4.12: 
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(b) 

Deconvolved  image  data  of  a  resolution  target.  10  pixels  separate  the 
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V.  Maximum  Likelihood  Pupil  Plane  Image  Recovery 

5. 1  Introduction 

This  chapter  will  demonstrate  a  high  resolution  pupil  plane  imaging  system. 
The  algorithm  proposed  here  is  a  Maximum  Likelihood  pupil  plane  algorithm. 

5.2  Problem  description  and  geometry 

We  have  previously  described  methods  for  recovering  images  from  pupil  plane 
intensity  measurements.  This  can  have  many  advantages  since  the  arrays  can  be 
made  large  and  conformal.  This  raises  the  theoretical  resolution  limit  without  a  large 
increase  in  system  volume,  ffowever,  pupil  plane  systems  suffer  from  some  widely 
known  stagnation  problems  [15].  This  chapter  derives  and  implements  a  new  pupil 
plane  algorithm.  This  new  algorithm  does  not  eliminate  the  existing  stagnation  issues, 
but  is  formed  in  a  manner  that  will  allow  it  to  be  easily  fused  with  the  image  plane 
system  of  chapter  4. 

5.2.1  Geometry  and  Assumptions.  The  coordinate  system  for  all  the  fol¬ 
lowing  derivations  will  use  (x,  y) functions  for  positions  in  the  object  plane,  (a, /3)  in 
the  pupil  plane,  ( u ,  v )  in  the  image  plane,  and  (£,  //)  for  shifts  in  the  autocorrelations, 
ffowever  to  simplify  notation,  we  define  the  following  variables  in  M2  to  represent  the 
ordered  pairs 

X  =  (x,  y) 

A  =  ( a, /3 ) 

*  =  (Z,v) 

U  =  (u,v) 

ft  also  must  be  pointed  out  that  for  the  entire  paper  any  matrix  product  is  a  ffadamard 
(or  element  wise)  product. 
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5.2.2  Pupil  Plane  Data  Model.  The  pupil  plane  model  is  based  on  work 
from  Fienup  and  Idell  [21],  which  relates  the  autocorrelation  of  the  object  brightness 
function  to  the  average  magnitude  squared  of  the  Fourier  transform  of  the  pnpil  plane 
intensity.  The  autocorrelation  of  the  object  brightness  function  is  given  by 

=  £  o(A>(T  +  X)  (5.1) 

x 

Image  recovery  begins  with  estimating  the  average  energy  spectrum  of  the  observed 
speckle  pattern  by  averaging  the  squared  moduli  of  many  independent  speckled  au¬ 
tocorrelations 

fiTm  =  [I^a[/W(V)]|2H(A)]  (5.2) 

where  / is  the  kth  realization  of  the  held  reflected  from  the  object  and  the  aperture 
function,  p[  defines  the  region  over  which  the  speckle  pattern  is  observed  in  the  pupil 
plane,  P^ !i>  'n  an  inverse  Fourier  transform  which  operates  in  a  function  in  A  and 
returns  a  function  of  T,  and  Px*a  is  a  Fraunhoffer  propagation  operator  [20].  It  has 
been  shown  that  as  the  number  of  independent  speckle  realizations,  J,  increases  the 
average  energy  spectrum  converges  to  [21] 

K 

lim  K-1  Y  |i?Jfc)(T)|2  =  bsJV)  +  cR0{ T)  *  s„(T)  (5.3) 

K— >oo  ^ ' 

k= 1 

where  b  and  c  are  constants,  and  sp(\l/)  =  |h('F)|2  is  the  PSF  of  the  pupil  plane 
aperture. 

The  intensity  in  the  pupil  plane  is  related  to  the  object  held  by 

pupiPH A)  =  I^!a{/‘W}|2  +  nf’jA)  +  nfpim(A)  (5.4) 

where  n^ot  is  shot  noise  from  random  photon  arrival  times  and  nf^eckle  is  the  speckle 
noise  associated  with  coherent  systems.  It  is  convenient  at  this  point  to  define  a 
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variable  to  represent  a  single  frame  of  transformed  pupil  plane  data 


{|^£a{/wPO}I2  +  dl(A)  +  ™£cHe(A)} 


(5.5) 


where  k  is  the  frame  number.  According  to  Equation  5.2,  we  can  average  many 
frames  of  this  transformed  data  to  represent  the  average  speckled  autocorrelation  of 
the  object  brightness  function. 

K 

dr('Sf)  =  K-1J24k\y)  (5-6) 

k= 1 


Provided  that  K  is  large  dr  will  be  approximately  Gaussian  with  mean  i?('P)  = 
6 1 /?. ( \P ) | 2  +  c.Ro('b)  *  |h('k)|2.  From  this  we  can  write  an  equation  for  the  probability 
distribution  of  a  single  pixel 


PDr(f)(dr('h)) 


exp 


zMh®!:! 

2cr2  J 


(5.7) 


If  we  also  assume  that  the  noise  in  each  pixel  is  independent  of  the  noise  in  all  the 
other  pixels  we  can  write  the  equation  for  the  probability  of  realizing  an  entire  ”  scene” 


Pd„ 


( dr\o,h )  =  J] 


exp 


~[dr(g)-.R(tt)]2 

2cr2 


V2tt> 


H  <7 


(5.8) 


5.3  Image  Recovery  Algorithm 

The  algorithm  development  begins  by  defining  the  log-likelihood  function 


where 


£(o,  b,  c )  =  lnpDr(dr\o,  sp)  =  ^ 


-[dr(^)-R^,b,c)f 

2<r2(T) 


R(^,b,c)  =  5sp(T)  +  c[R0  *  sp](T) 


(5.9) 


(5.10) 
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The  maximum  likelihood  (ML)  estimator  is  defined  by  [34] 


V£(A)U=a  =  0  (5.11) 

where  A  =  [o  b  c]T  and  a  —  [6  b  c]T 

An  estimate  of  o  is  difficult  to  find  in  closed  form,  but  rather  can  be  solved 
iteratively  after  we  have  initial  estimates  of  b  and  c. 


5.3.1  Estimates  of  b  and  c.  To  solve  for  initial  estimates  of  b  and  c  we 
restate  the  problem  as  the  likelihood  function  of  b  and  c  conditioned  on  o 


v  -KW  -  bsp(9)  -  c[fl0  * 

~  A 


(5.12) 


The  new  estimation  routine  is  defined  by 


V£(A)U=a  =  0 


(5.13) 


where  A  =  [b  c]T  and  a  =  [b  c]T .  These  estimates  will  be  updated  as  the  value  of  o  is 
refined.  To  find  values  of  b  and  c  we  must  find  the  gradient  of  C{A) 


V£(6,c) 


dC(b,  c ) 
db 


d C(b,  c ) 
dc 


(5.14) 


Solving  these  two  derivatives  separately  and  setting  them  equal  to  zero  gives  us  a  set 
of  simultaneous  equations. 


dC{b,c)  _  ^  [dr{V)-bsp(V)-c[R<)*sp]{*)]sp{V) 
db  <r2(T) 


dC(b,c)  _^[dr^)-bsp^)-c[R0*sp}m[R0*sp}^) 
dc  a2(T) 
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Next  we  solve  Equation  6.22  for  c  as  a  function  of  b 


Substitute  Equation  6.23  into  Equation  6.21  and  solve  for  b 

i  Eg.  ~  Pi  E^  *  SplW 


(5.17) 


(5.18) 


where 


dr(W)[B<)*sp](W) 

Ev[Ro*sP}2m 


T^miRo^pim 

ZARo*sP\*m 


(5.19) 


(5.20) 


Finally,  substitute  the  value  of  b  into  Equation  6.23  to  calculate  a  value  for  c. 


5.3.2  Estimation  algorithm  for  o.  Next  we  will  build  an  iterative  algorithm 
to  solve  for  o.  It  is  informative  here  to  write  the  log-likelihood  function  to  show  the 
dependance  on  o.  We  do  this  by  recognizing 


[R0  *  SpK'b)  =  EE  o(X)o(U  +  A>p(tf  -  U)  (5.21) 

u  x 

Combining  this  result  with  Equation  5.10  and  (6.14)  we  are  able  to  write 


r,  „  ,  ^  -K(»)  -  M*)  -  cE„  E.v  +  -'')%(*  -  0(»)l’ 

111  -  2,  2<72(®) 


Next  we  take  the  derivative  of  this  log-likelihood  with  respect  to  o 

dC(o\b,  c)  v  K(jQ  -  bsp(V)  -  c[R0  *  sp]($)]cEc/  ~  U) 

do  cr2(4/) 


(5.22) 


where 


OR 

do 


o(X  +  U)  +  o(X  -  U) 


(5.23) 


53 


Below  are  a  few  definitions  to  simplify  this  equation 


f)R 

cE  ~u)  =  c{[°  *  6pKvI/ + x)  +  [°*  -  x)}  (5-24) 


$i(T  +  X)  =  [o*sp](T  +  X) 

4>2(T-X)  =  [o*sp](tf-X) 

(5.25) 

m  MT)  +  c[R0*sp){^) 
f[  ’  u2(T) 

(5.26) 

Using  the  above  definitions  (and  some  further  simplification)  we  are  able  to 
write  a  R-L  algorithm 


Onew  (X)  =  oold(X) 


c[£*$i]  (X)  +  c[£*$2]  C X ) 
\p*$i]  (X)  +  [p*$2]  (X) 


(5.27) 


5.3.3  Stopping  the  Algorithm.  Image  recovery  techniques  built  around  a 
Richardson-Lucy  algorithm  are  often  ran  for  a  set  number  of  iterations  and  then  ex¬ 
ited.  It  would  be  better  to  have  an  optimized  method  of  exiting  the  iterations.  It  is 
fairly  well  known  that  iterating  beyond  an  optimal  point  can  lead  to  noise  amplifica¬ 
tion.  This  is  due  to  the  fact  that  all  Maximum  likelihood  techniques  attempt  to  fit 
the  data  as  closely  as  possible  given  the  constraints  of  the  problem.  This  leads  to  the 
question  of  when  do  we  stop  our  algorithm.  Rather  than  stop  the  algorithm  we  will 
look  at  methods  to  damp  the  iteration  to  avoid  over  iteration  regardless  of  how  long 
the  algorithm  is  run.  This  means  one  can  iterate  for  as  long  as  time  allows  with  some 
assurance  that  iteration  z  +  1  will  never  be  worse  that  iteration  z. 

The  damping  routine  looks  at  the  statistics  of  the  data  set  and  compares  it  to 
the  model’s  statistics.  When  the  variance  of  the  model  is  near  the  variance  of  the 
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data  that  pixel  is  damped.  Stated  mathematically 


K 

K-1  J2(dr  -  R?  <  acTr  (5-28) 

fc= i 

where  of  is  the  measured  variance  of  the  data  set  and  a  is  a  user  chosen  value  to  de¬ 
termine  the  degree  of  damping  (guidance  for  choosing  a  can  be  found  in  section(4.4)); 
R  is  defined  as 

i?('b)  =  Onew  (AT)  Onew  (  T  +  X)  (5.29) 

Using  the  above  criteria  we  can  create  a  binary  map,  mr,  that  will  damp  the 
iteration  for  the  pixels  where  the  criteria  are  satisfied.  The  values  of  the  binary  map 
will  be  set  1  for  every  pixel  where  the  criteria  are  satisfied,  otherwise  the  value  is 
zero.  The  maps  are  updated  at  each  iteration.  The  maps  are  applied  to  the  update 
equation  so  that  where  ever  the  map  equals  1  the  data  and  the  model  are  forced  to 
be  equal,  essentially  stopping  that  pixel  from  changing  for  that  iteration.  Using  these 
criteria  we  restate  Equation  5.26  and  5.27 


p(T)  =  6,sp(T)  +  c 


(1  -  mr)[Ro  *  sp](T)  +  mr 


dr  —  bsp 


Onew  (X)  =  oold{X) 


C  [dr  *  $X]  (X)  +C[dr*  $2]  (X) 


[p*4>i]  (X)  +  [p*$2]  (X) 
where  mr  is  the  binary  map  applied  to  the  pupil  data. 


(5.30) 

(5.31) 


5.4  Results 

This  section  will  compare  the  results  of  the  proposed  joint  image  recovery  algo¬ 
rithm  to  that  of  deconvolution.  This  simulated  data  sets  were  created  in  a  manner 
identical  to  that  outlined  in  Chapter  4.  After  the  data,  sets  were  created,  two  image 
recovery  algorithms  were  run:  1)  deconvolution  [25]  and  2) the  pupil  plane  algorithm. 
The  pupil  plane  algorithm  is  an  intermediate  step  to  a  more  complete  joint  algorithm 
that  will  be  discussed  in  later  chapters;  for  this  reason  the  pupil  plane  analysis  is  not 
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as  thorough  as  the  deconvolution.  The  analysis  will  only  consist  of  evaluating  the 
resolution  of  the  algorithm  and  the  ability  of  the  algorithm  to  recover  varied  inten¬ 
sity  images.  A  section  will  also  be  devoted  to  showing  some  of  the  limitations  of  the 
algorithm  as  motivation  for  the  remaining  work. 

5-4-1  Quantifying  resolution.  The  simulation  was  run  in  the  absence  of 
turbulence  to  get  a  diffraction  limited  image  which  is  shown  in  Figure  5.1a.  Clearly 
in  the  absence  of  turbulence  the  imaging  system  would  have  no  problem  resolving 
the  bars  in  the  target.  The  simulation  is  then  run  with  D /tq  =  10  and  the  resulting 
image  is  shown  in  Figure  5.1b.  In  this  case  none  of  the  bars  are  resolved.  In  order 
to  try  and  regain  the  lose  resolution  we  apply  the  deconvolution  algorithm  and  let  it 
run  to  convergence;  the  resulting  image  is  shown  in  Figure  5.1c  and  again  the  bars 
cannot  be  resolved.  Figures  5.1d-e  show  the  results  of  the  pupil  algorithm;  clearly 
we  have  satisfied  the  resolution  criteria  from  chapter  4.  In  draper  4,  it  was  shown 
that  deconvolution  had  a  resolution  limit  of  10  pixels;  the  pupil  plane  algorithm  has 
a  resolution  limit  of  5  pixels  for  the  given  turbulence  scenario. 

5-4-2  Ability  to  recover  varied  intensity  images.  In  section(4.6.4)  it  was 
shown  that  deconvolution  algorithms  are  able  to  accurately  recover  various  intensities 
in  a  single  image.  That  same  analysis  was  performed  for  the  pupil  plane  algorithm 
with  very  different  results.  The  pupil  plane  algorithm  results  are  shown  in  Figure  5.2. 
Figure  5.2a  shows  the  truth  image  and  the  recovered  image  is  shown  in  Figure  5.2b. 
Clearly  the  pupil  algorithm  cannot  recover  the  varied  intensity  in  the  image.  This 
is  easily  explained  by  looking  at  Figure  5.2c,  which  shows  the  autocorrelation  of 
the  truth  object.  Since  the  pupil  algorithm  estimates  the  image  from  a  corrupted 
measurement  of  the  autocorrelation  (which  is  symmetric)  and  the  autocorrelation  to 
object  relationship  is  not  one  to  one,  the  algorithm  will  choose  an  object  that  satisfies 
the  known  constraints.  In  the  absence  of  information  to  the  contrary,  the  algorithm 
will  recover  a  symmetric  object. 
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(a) 


(b) 


(c) 


50  100  150  200  250  300 


(d) 


(e) 


Figure  5.1:  Results  of  image  recovery  algorithms  on  a  bar  target  where  5  pixels 
separate  the  bars,  (a)  show  a  diffraction  limited  image  (b)  show  the  raw  data  in  the 
presence  of  turbulence  (c)  shows  the  results  of  deconvolution  (d)  shows  the  results  of 
the  pupil  algorithm  and  (e)  shows  a  slice  throught  (d) 
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(a) 


(b)  (c) 


Figure  5.2:  (a)  Image  with  intensity  variations  (b)  Image  recovered  using  pupil 

algorithm  (c)  autocorrelation  of  the  truth  object 
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5-4-3  Limitations  of  pupil  plane  algorithms.  The  primary  limitation  to  be 
discussed  here  is  the  inability  of  pupil  plane  algorithms  to  recover  complex  targets  in 
the  absence  of  good  object  support.  The  reason  for  this  stems  from  the  fact  that  the 
algorithms  are  generally  based  on  using  the  object  autocorrelation  in  the  data  model. 
Since  an  autocorrelation  function  is  always  symmetric  and  does  not  possess  a  one  to 
one  mapping  with  the  underlying  object,  algorithms  seeking  to  recover  images  from 
autocorrelations  will  have  severe  twin  image  stagnation  problems.  This  can  been  seen 
in  Figure  5.3;  the  satellite  body  is  ’recovered’  in  both  the  lower  right  and  the  upper 
left  of  the  satellite,  while  the  smaller  structures  (antennas)  are  lost  entirely. 


(a)  (b) 

Figure  5.3: 
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VI.  Joint  Data  Algorithm 


6. 1  Introduction 

The  chapter  documents  the  fusion  of  the  algorithms  listed  in  the  prior  chapters. 
The  algorithms  are  fused  using  Bayesian  methods.  The  algorithm  proposed  here  will 
use  an  image  plane  data  set  in  conjunction  with  a  pupil  plane  data  set.  From  the 
two  data  sets  a  Maximum  likelihood  estimator  will  be  formed.  The  use  of  image  data 
with  correlography  methods  is  not  new.  Fienup  suggested  the  use  of  image  plane  data 
in  conjunction  with  pupil  plane  measurements  [14],  but  these  sets  of  data  were  not 
fused  using  Bayesian  methods  to  estimate  an  image. 

Much  of  the  information  from  previous  chapters  will  be  repeated  for  clarity. 

6.2  Geometry  and  Assumptions 

The  coordinate  system  for  all  the  following  derivations  will  use  (x,  ^functions 
for  positions  in  the  object  plane,  (a,  (3)  in  the  pupil  plane,  (u,v)  in  the  image  plane, 
and  (£,  rj)  for  shifts  in  the  autocorrelations.  However  to  simplify  notation,  we  define 
the  following  variables  in  M2  to  represent  the  ordered  pairs 

X  =  (x,y) 

A  =  (a,p) 

T  =  (S,v) 

U  =  (u,v) 

It  also  must  be  pointed  out  that  for  the  entire  paper  any  matrix  product  is  a  Hadamard 
(or  element  wise)  product. 
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6.3  Image  and  Pupil  Plane  Data  Models 

This  section  is  devoted  to  describing  the  statistical  models  for  both  data  sets 
with  the  goal  of  providing  probability  density  functions  (pdf)  for  the  measurements 
in  both  planes. 


6.3.1  Image  Plane  Data  Model.  The  model  for  the  image  plane  intensity 
takes  advantage  of  the  fact  that  the  multi-frame  average  of  laser  speckle  images  con¬ 
verges  to  the  incoherent  model  [24],  This  is  due  to  the  assumption  that  the  phase  at 
the  target  is  random  and  independent  from  observation  to  observation  in  a  manner 
consistent  with  the  time  varying  phase  distribution  produced  by  incoherent  light.  By 
capitalizing  on  this,  the  image  plane  model  becomes  a  convolution  of  the  geometric 
image  of  the  source  intensity  and  a  Point  Spread  Function  (PSF).  The  PSF  would 
include  the  effects  of  the  optical  system  and  the  path  turbulence.  This  convolution 
is  written  discretely  to  facilitate  the  derivation  of  a  computer  algorithm  for  image 
recovery. 

i{U)  =  YJ°(x)*i(U  ~  x)  (6.1) 

x 

where  o(X)  is  the  geometric  image  of  the  source  intensity  and  Si(U)  is  the  image 
plane  PSF.  Since  the  imaging  aperture  is  large  with  respect  to  the  Fried  parameter 
of  the  turbulence,  the  PSF  is  dominated  by  atmospheric  effects.  The  long  exposure 
PSF  is  used  since  the  Signal  to  Noise  Ratio  (SNR)  of  fully  developed  speckled  frames 
is  always  equal  to  1;  making  tip-tilt  removal  problematic. 

The  convolution  model  provides  the  mean  of  the  image  plane  data.  Since  we  are 
approximating  an  incoherent  image  using  a  multi-frame  average,  the  Probability  Mass 
Function  (PMF)  of  the  image  plane  data  is  best  modeled  by  the  PMF  of  partially 
coherent  light  corrupted  by  photon  noise  [19],  the  negative  binomial  distribution  best 
describes  the  probability  of  receiving  di(U)  photons  shown  below  as 


PDi(u){di{U)\o,M,Si) 


T[di(U)  +  M ] 
r  [di(U)  +  1}T[M] 
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(6.2) 
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where  T  is  the  gamma  function,  dj(U)  is  the  measured  data,  D^U)  is  a  random 
variable  representing  the  image  plane  data,  and  A4  is  the  number  of  coherent  data 
frames  used  in  the  formation  of  the  data,  di(U).  Because  the  noise  in  each  pixel  is 
assumed  to  be  independent  and  identically  distributed  (iid)  we  can  write  the  joint  pdf 
of  the  entire  scene  as 


PDiidi)  = 


u 


r  [di(U)  +  M]  r  M 
r [di(U)  +  1}T[M]  [  i(U) 


-di(U) 


1  + 
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(6.3) 


6.3.2  Pupil  Plane  Data  Model.  The  pupil  plane  model  is  based  on  work 
from  Fienup  and  Idell  [21],  which  relates  the  autocorrelation  of  the  object  brightness 
function  to  the  average  magnitude  squared  of  the  Fourier  transform  of  the  pupil  plane 
intensity.  The  autocorrelation  of  the  object  brightness  function  is  given  by 


%>m  =  X  o(X)o(T  +  X)  (6.4) 

x 

Image  recovery  begins  with  estimating  the  average  energy  spectrum  of  the  observed 
speckle  pattern  by  averaging  the  squared  moduli  of  many  independent  speckled  au¬ 
tocorrelations 

<}(T)  =  [\F[f(k\X)]\2H(A)]  (6.5) 

where  T  is  a  Fourier  transform,  fk  is  the  kth  realization  of  the  held  reflected  from 
the  object  and  the  aperture  function, and  H  defines  the  region  over  which  the  speckle 
pattern  is  observed  in  the  pupil  plane.  It  has  been  shown  that  as  the  number  of 
independent  speckle  realizations,  K,  increases  the  average  energy  spectrum  converges 
to  [21] 

K 

lim  K-1  y  |4fc)  Wl2  =  bsp(x1')  +  4R0  *  Sp] (tf)  (6.6) 

K—kx  ' 

k= 1 

where  b  and  c  are  constants,  *  is  the  convolution  operator,  and  <Sp(T)  is  the  PSF  of 
the  pupil  plane  aperture. 
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The  intensity  in  the  pupil  plane  is  related  to  the  object  held  by 

pupil^  A)  =  +  nfLW  +  n speckle (^)  (®-^) 

where  nfkot  is  shot  noise  from  random  photon  arrival  times  and  n[kJeckle  is  the  speckle 
noise  associated  with  coherent  systems.  It  is  convenient  at  this  point  to  define  a 
variable  to  represent  a  single  frame  of  transformed  pupil  plane  data 

{i^a{/(‘)P0}|2  +  dl(A)  +  dSc««(A)}f  (6-S) 

where  k  is  the  frame  number.  According  to  Equation  6.5,  we  can  average  many 
frames  of  this  transformed  data  to  represent  the  average  speckled  autocorrelation  of 
the  object  brightness  function. 

K 

dr{y)  =  K-1^2dlk\V)  (6.9) 

fc=i 

Provided  that  K  is  large  dr  will  be  approximately  Gaussian  with  mean 


A(T)  =  bsp(V)  +  c[R0  *  sp](T) 


(6.10) 


This  results  from  the  central  limit  theorem  since  we  are  adding  many  independent, 
identically  distributed  random  variables  [6].  From  this  we  can  write  an  equation  for 
the  probability  distribution  of  a  single  pixel 


PDr(f)(dr('P))  — 


-KW-R(fr)]2 

2  cr2 


(6.11) 


If  we  also  assume  that  the  noise  in  each  pixel  is  independent  of  the  noise  in  all  the 
other  pixels  we  can  write  the  equation  for  the  probability  of  realizing  an  entire  ”  scene” 


exp\-^~RW  n 

PDr(dr\o ,  h)  =  TT - -  (6.12) 

*  v27ru 
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6.4  Joint  Algorithm 

The  algorithm  development  begins  by  defining  a  joint  log-likelihood  function 


C(o,  b,c )  =  Cp(o,b,c)  +  Ci(o) 


(6.13) 


where 


where 


and 


Cp(o,b,c)  =  lnpDr(dr\o,sp)  = 


9 


2  u2(T) 


R(^,b,c)  =  bsp(ty)  +  c[R0  *  sp](T) 


(6.14) 


(6.15) 


Ci(o)  =  In PdM^s^M)  =  J2di(U)ln[i(U)}  -  [d^U)  +  M]ln[i(U)  +  M]  (6.16) 


u 


The  maximum  likelihood  (ML)  estimator  is  defined  by  [34] 


V£(Z)U=a  =  0 


(6.17) 


where  A  —  [o  b  c]T  and  a  —  [6  b  c]T 

An  estimate  of  o  is  difficult  to  find  in  closed  form,  but  rather  can  be  solved 
iteratively  using  initial  estimates  of  b  and  c. 

6-4-1  Estimates  of  b  and  c.  To  solve  for  initial  estimates  of  b  and  c  we 
restate  the  problem  as  the  likelihood  function  of  b  and  c  conditioned  on  o 


C(b,c\o)  = 


—  [dr(T)  —  &sp(T)  —  c[R0  *  <sP]('L)]5 


2  u2(T) 


(6.18) 


The  new  estimation  routine  is  defined  by 


V£(A)U=a  =  0 


(6.19) 
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where  A  =  [b  c]T  and  a  =  [b  c]T .  These  estimates  will  be  updated  as  the  value  of  o  is 
refined.  To  find  values  of  b  and  c  we  must  find  the  gradient  of  C{A) 


V£(6,c) 


dC(b,  c ) 
db 


dC(b,  c ) 
dc 


(6.20) 


Solving  these  two  derivatives  separately  and  setting  them  equal  to  zero  gives  us  a  set 
of  simultaneous  equations. 


dC(b,  c)  _  v  K(^)  -  5sp(T)  -  c[Ro  *  sp](tf  )]sp(tf) 
db  ^  <r2(T) 

dC{b,  c)  _  ^  K(g  -  tepQg)  -  c[R0  *  SgKjQpo  *  sp]{V) 
dc  <t2(T) 

Next  we  solve  Equation  6.22  for  c  as  a  function  of  b 

^[dr^)~bSpmiRo*sPm 

E^O  *  -Sp]2('h) 

Substitute  Equation  6.23  into  Equation  6.21  and  solve  for  b 


(6.21) 


(6.22) 


(6.23) 


t  EgrfrOEQspQE)  -Lp1r^mR^m 

E*s2(T)  -p2E$spW[-ro  *sp\(^) 


(6.24) 


where 


and 


Pi 


P  2 


£*,  dr(V')[Bo*sp](*') 
Ev[Ro*sP]2m 

(6.25) 

E*.  sPmiRo  *  sP]m 

(6.26) 

Finally,  substitute  the  value  of  b  into  Equation  6.23  to  calculate  a  value  for  c. 


6-4-2  Estimation  algorithm  for  o.  Next  we  will  build  an  iterative  algorithm 
to  solve  for  o.  It  is  informative  here  to  write  the  log-likelihood  function  to  show  the 
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dependance  on  o.  We  do  this  by  recognizing 


[Ro  *  Sp]  W  =  EE  o(X)o(U  +  A>p(T  -  U) 


(6.27) 


u  x 


Combining  this  result  with  Equation  6.15  and  (6.14)  we  are  able  to  write 


£p{o\b,c)  =  Y 


We  further  must  recognize 


-KW  -  bsp^)-c^uZxo(X)o(U  +  X)sp^  -  I7)(tf)]: 


2<x2('k) 


i{u)  =  Y°(x)si(u~x) 


(6.28) 


(6.29) 


Combining  this  result  with  Equation  6.16  we  write 


Ci(o)  =  Y  d^U)  lnE  °(XMU  -  x )]  -  W)  +  M]  lnE  °(x)si(u  -  x )  +  M } 

U  X  X 

(6.30) 

This  gives  a  joint  log-likelihood  of 


_[dr(T)  -  &gp(T)  -  cE,  Ex  o{X)o{U  +  A>P(T  -  U)(n2 

4^  2o-2(T) 

+Yd*(u)  hY°^x>^u  -  X^ 

U  X 

-  \di(U)  +  M]  ln[ Y  o(X)si(U  -  X)  +  M\  (6.31) 


Next  we  take  the  derivative  of  this  log-likelihood  with  respect  to  o 
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where 


f)  R 

—  =  o(X  +  U)  +  o(X  -  U) 

GO 


(6.33) 


Below  are  a  few  definitions  to  simplify  this  equation 


dR 


Y  -q^sp(xV  -U)  =  c{[o*  sp](tt  +  x)  +  [o*  sp](4>  -  X)} 


u 


(6.34) 


<f>i(vh  +  A")  =  [o*sp](T  +  X) 

$2(^-X)  =  [0*Sp](tt-X) 

m  =  bspjty)  +  c[R0  *  Spljty) 
f[  ’  <r2(T) 

Using  the  above  definitions  (and  some  further  simplification)  we  are  able  to 
write  a  ML  algorithm 


(6.35) 

(6.36) 
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(X)  +  c  [%  *  4>t]  (X)  +  c  [i  »  42]  [X ) 


[1w  *  d  W  + c  \p  *  »■]  (A)  +  c  Ip  *  »=]  (X) 


(6.37) 


6-4-3  Stopping  the  Algorithm.  Image  recovery  techniques  built  in  a  manner 
similar  to  the  Richardson-Lucy  [30]  algorithm  are  often  run  for  a  set  number  of  iter¬ 
ations  and  then  exited.  It  would  be  better  to  have  an  optimized  method  of  exiting 
the  iterations.  It  is  fairly  well  known  that  iterating  beyond  an  optimal  point  can 
lead  to  noise  amplification  [36].  This  is  due  to  the  fact  that  all  Maximum  likelihood 
techniques  attempt  to  fit  the  data  as  closely  as  possible  given  the  constraints  of  the 
problem.  This  leads  to  the  question  of  when  do  we  stop  our  algorithm.  The  approach 
presented  here  involves  a  method  designed  to  damp  the  iteration,  thus  avoiding  noise 
amplification  regardless  of  how  long  the  algorithm  is  run.  This  means  one  can  iterate 
for  as  long  as  time  allows  with  some  assurance  that  iteration  n  +  1  will  not  be  worse 
than  iteration  n. 
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The  damping  routine  looks  at  the  statistics  of  each  data  set  and  compares  it 
to  the  models  statistics.  When  the  variance  of  the  model  approaches  the  variance  of 
the  data,  that  pixel  is  damped.  Stated  mathematically  for  the  image  plane  and  pupil 
plane  respectively 

K 


K~l 


£(4 


i)2  <  a*  a2 


(6.38) 


K 

K-^r-R)2<p*a2  (6.39) 

k= 1 

where  a2  and  a2  are  the  measured  variance  of  each  data  set  and  a  and  (3  are  user 
chosen  values  to  determine  the  degree  of  damping;  i  and  R  are  defined  as 


i(U)  =  Y  onew(X)Si(U  -  X )  (6.40) 


R{V)  =  Y  °new(X)onew{y  +  X)  (6.41) 

Using  the  above  criteria  we  can  create  two  binary  maps,  m,  and  mr,  that  will 
damp  the  iteration  for  the  pixels  where  the  criteria  are  satisfied.  The  values  of  the 
binary  maps  will  be  set  1  for  every  pixel  where  the  criteria  are  satisfied,  otherwise 
the  value  is  zero.  The  maps  are  updated  at  each  iteration.  The  maps  are  applied  to 
the  update  equation  so  that  where  ever  the  map  equals  1  the  data  and  the  model  are 
forced  to  be  equal,  essentially  stopping  that  pixel  from  changing  for  that  iteration. 
Using  these  criteria  we  restate  Equations  6.36  and  6.37 
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(6.43) 
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where  and  mr  are  the  binary  maps  applied  to  the  image  and  pupil  data  sets 
respectively. 

6. 5  Results 

The  simulated  data  sets  were  created  using  Matlab®  .  Each  set  consists  of  200 
independently  created  frames  of  both  pupil  plane  and  image  plane  data.  In  order 
to  create  the  frames  of  data  a  held  magnitude  was  defined  in  the  object  plane.  A 
random  phase  (uniform  on  (— mr  :  rm] )  was  added  to  every  point  in  the  held.  Each 
phase  is  independent  of  every  other  phase  in  the  held.  The  target  was  sampled  at  the 
Nyquist  rate  required  by  the  pupil  plane  aperture.  The  sample  rate,  A,  is  calculated 
according  to 

A  =  t (6.44) 

pupil 

where  A  is  the  optical  wavelength,  z  is  the  propagation  distance,  and  Dpupu  is  the 
diameter  of  the  aperture.  This  held  is  then  propagated  to  the  pupil  plane  using  a 
Fourier  transform.  It  can  be  shown  that  due  to  the  random  phase  in  the  target  held 
any  propagation,  regardless  of  propagation  length,  can  be  modeled  as  a  Fraunhoffer 
propagation  and  therefore  a  simple  Fourier  transform.  The  pupil  plane  aperture 
function  was  then  imposed  on  the  held;  after  taking  the  magnitude  squared  of  the 
held  and  adding  photon  and  read  noise  we  arrive  at  our  raw  pupil  plane  data.  The 
imaging  aperture  function  and  the  turbulence  phase  screen  was  then  applyed  to  the 
pupil  plane  held  and  another  Fourier  transform  was  performed  to  give  us  the  image 
plane  held;  the  magnitude  squared  gives  us  the  image  plane  intensity.  Photon  and 
read  noise  were  added  in  the  image  plane. 

After  the  data  sets  were  created,  three  image  recovery  algorithms  were  run:  1) 
deconvolution  [25],  2)Correlography  [21]  and  3)the  joint  algorithm.  For  each  algorithm 
the  raw  image  data  is  used  as  the  initial  estimate.  It  is  often  difficult  to  chose  metrics 
to  quantify  the  quality  of  a  recovered  image,  since  for  almost  any  metric  an  object 
estimate  can  be  chosen  that  is  much  better  or  worse  than  the  chosen  metric  implies. 
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Initially  mean  absolute  error  (MAE)  was  chosen  as  a  metric.  MAE  is  defined  as 


N,M 

MAE  =  (NM)-1  \°~°\  (6.45) 

n=l,m=l 


This  metric  was  quickly  discarded  as  the  errors  were  excessively  large  for  estimates 
that  were  shifted  with  respect  to  the  truth  object.  To  avoid  this  problem  we  choose 
a  slightly  more  complicated  metric  that  is  translation  invariant  [9] 


Er  =  min 


E  \°(u 


Uq,Vq 


Uq,V  ~  Vp)  ~  o(u,v)  |2 

EK«d)f 


which  is  more  easily  calculated  as 


(6.46) 


Er 


Oo(0,  0)  T  eoo(0,  0)  2  max  Uo,Vo  Re{r Oq(u0,  u0)} 

Too  (0,0) 


(6.47) 


where  rOQ  and  Tod  are  autocorrelations  of  the  object  and  the  estimate,  and  r0l 3  is  the 
cross  correlation.  An  estimator  will  seek  to  minimize  this  metric,  Er. 

The  truth  target,  seen  in  Figure  6.1  was  coherently  illuminated  and  the  reflected 
energy  was  propagated  along  a  turbulent  path  to  a  imaging  system  seen  in  Figure  6.2. 
The  measured  intensity  was  recorded  and  sent  to  the  image  recovery  algorithms. 


6.5.1  Baseline  Results.  Figure  6.3  show  the  results  of  the  joint  algorithm 
compared  to  the  results  of  the  deconvolution  algorithm  for  3500  iterations  with  R  = 
10.  As  predicted  the  joint  algorithm  give  better  results  at  every  iteration.  For  this 
comparison  each  algorithm  was  run  with  the  same  data  set  and  using  the  same  starting 
image.  Figure  6.4  show  the  recovered  image  from  each  algorithm;  clearly  the  joint 
algorithm  recovers  much  more  of  the  high  frequency  detail. 

To  fully  quantify  how  well  the  joint  algorithm  works  we  must  look  at  more  than 
just  a  single  data  set.  One  hundred  different  data  sets  were  created  and  the  joint 
algorithm  run  for  3500  iterations  on  each;  the  results  are  shown  in  Figure  6.5.  This 
graph  demonstrates  that  the  algorithm  results  have  a  reasonably  low  variance. 
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Figure  6.1: 
data 


(a)  Truth 


-r 


(b)  Raw  Data 

Truth  image  used  for  all  simulations  and  one  realization  of  raw  image 
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Figure  6.2:  Notional  system  architecture 
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Figure  6.3:  Plot  showing  how  the  results  of  the  joint  algorithm  compares  to  the 
results  from  deconvolution 
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Figure  6.4: 


(a)  Deconvolution 


(b)  Joint 

Recovered  image  from  each  algorithm  at  3500  iterations 
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Figure  6.5:  Results  joint  algorithm  run  on  100  data  sets.  The  mean  was  plotted 
with  error  bars  showing  one  standard  deviation  of  the  data. 

The  next  step  in  algorithm  validation  was  to  verify  that  the  damping  of  the 
algorithm  was  working  properly.  Figure  6.6  shows  Er  for  3500  iterations  of  both  the 
damped  and  the  undamped  joint  algorithm.  It  should  be  clear  that  without  damping 
a  very  accurate  stopping  criteria  would  be  needed.  The  damped  algorithm  removed 
the  need  for  a  stopping  criteria  at  the  expense  of  speed.  Both  algorithms  ultimately 
give  about  the  same  quality  of  recovered  image.  Figure  6.7  show  the  recovered  images 
at  varying  number  of  iterations;  it  is  important  to  note  that  the  image  is  very  stable 
and  very  little  change  occurs  between  3500  and  10000  iterations.  From  an  application 
stand  point  the  damped  algorithm  is  run  for  as  long  as  time  allows  without  the  fear 
of  over  iterating.  In  real  time  applications  the  undamped  algorithm  can  be  used  for 
speed,  but  with  the  risk  of  iterating  beyond  an  optimal  solution. 

In  order  to  compare  the  results  of  the  new  algorithm  to  existing  pupil  plane 
method  a  500  frame  data  set  was  simulated  and  processed  using  traditional  imaging 
correlography  techniques  [21].  The  results  of  these  simulations  are  shown  in  Figure  6.8. 
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Figure  6.6:  Plot  showing  how  the  results  of  the  undamped  joint  algorithm  compare 
to  the  results  of  the  damped  joint  algorithm 

Clearly  the  images  recovered  by  the  new  algorithm  have  less  speckle.  Correlography 
requires  many  more  frames  of  data  to  overcome  this  speckled  appearance,  ft  should 
also  be  noted  that  the  new  algorithm  does  not  require  a  human  in  the  loop;  correl¬ 
ography  techniques  require  the  removal  of  a  dc  term  (from  the  PSF  of  the  aperture) 
to  be  removed  from  the  data  prior  to  image  recovery,  while  the  new  algorithm  does 
this  automatically.  Further  the  new  algorithm  also  provides  a  repeatable  result  by 
damping  the  iterations. 

6.5.2  Varying  the  number  of  data  frames.  In  order  to  fully  understand 
algorithm  performance  it  is  necessary  to  show  results  as  a  function  of  the  number  of 
frames  of  data.  Figure  6.9a  shows  Er  as  a  function  of  the  number  of  frames.  There 
should  be  two  things  you  notice  from  this  plot:  (1)  Er  increases  as  the  number  of 
frames  decreases  and  (2)  the  damping  criteria  appears  to  break  down  for  low  frame 
counts.  The  fact  that  Er  increases  with  lower  frame  counts  should  come  as  no  surprise; 
our  model  is  based  on  the  limit  as  the  number  of  frames  increases  without  bound, 
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Figure  6.7:  Recovered  image  for  varying  number  of  iterations 
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Figure  6.8: 


(a)  1500  iterations 


(b)  10000  iterations 

Results  of  imaging  correlography  using  a  500  frame  data  set 
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(a)  (b) 

Figure  6.9:  Plots  showing  the  effects  of  varying  the  number  of  available  data  frames, 
(a)  shows  Er  as  a  function  of  frames  and  iterations  while  (b)  show  the  variance  of  the 
estimate  at  each  iteration  for  different  numbers  of  data  frames 

therefore  more  frames  gives  data  that  more  closely  matches  the  model.  Also  more 
frames  will  yield  a  higher  SNR.  As  for  the  damping  criteria  breaking  down,  the 
method  does  not  break  down,  but  rather  we  have  just  not  chosen  the  optimal  value 
for  the  damping.  It  was  hypothesized  earlier  in  the  chapter  that  the  choice  of  the 
damping  parameter  would  be  related  to  the  number  of  frames.  This  reinforces  that 
thought,  but  demonstrates  that  it  is  not  the  linear  relationship  used.  Figure  6.9b 
shows  the  variance  of  Er  as  a  function  of  frames.  Clearly  the  variance  will  rise  with 
fewer  frames  of  data.  The  reason  for  this  is  that  the  SNR  goes  down  as  the  number 
of  frames  goes  down. 

6.5.3  Effects  of  varying  the  strength  of  the  turbulence.  Since  the  algo¬ 
rithm  uses  both  pupil  and  image  plane  data,  turbulence  should  have  some  effects  on 
the  output,  but  not  as  much  as  a  it  would  for  a  image  plane  only  algorithm.  The 
joint  algorithm  was  run  for  four  different  turbulence  values;  the  results  are  shown 
in  Figure  6.10.  Stronger  turbulence  clearly  gives  a  higher  error  metric,  however  the 
difference  is  much  less  pronounced  than  for  deconvolution.  To  further  evaluate  the 
effects  of  turbulence  we  turn  to  a  subjective  look  at  reconstructed  images.  Looking 
at  Figure  6.11  we  can  see  that  the  effects  of  turbulence  are  present,  but  the  image 
degrades  slowly  with  increasing  turbulence. 
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Figure  6.10:  Plot  showing  how  the  results  of  the  joint  algorithm  are  impacted  by 
turbulence  strength 

6.5.4  Ability  to  recover  varied  intensities.  Figure  6.12  shows  the  results  of 
running  the  recovery  algorithm  on  a  gray  scale  image.  In  the  truth  image  the  center 
area  has  a  intensity  of  1  while  the  other  regions  (moving  clockwise  from  the  upper  left) 
have  intensities  of  0.1,  0.25,  0.5,  and  0.75.  The  algorithm  is  not  designed  to  recover 
absolute  radiometry,  but  it  is  important  to  be  able  to  recover  relative  intensity  values. 
The  outer  regions  in  the  recovered  image  have  intensities  of  10%,  22.6%,  44.1%  and 
70.8%  given  as  a  ratio  to  the  center  region  intensity.  Further  the  image  contains 
greater  than  90%  of  the  energy  in  the  field  of  view. 


6.5.5  Quantifying  system  resolution.  The  final  analysis  step  is  to  define  the 
system  resolution  in  a  manner  consistent  with  what  was  done  in  Chapters  4  and  5, 
which  was  to  image  progressively  smaller  bar  targets  and  define  the  system  resolution 
as  the  smallest  separation  that  could  be  resolved.  Recall  that  our  criteria  for  resolution 
is  only  three  bars  are  in  the  recovered  image  and  the  intensity  in  the  valleys  must  be 
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(a)  Truth  Object 


(b)  Recovered  image 


Figure  6.12:  Gray  Scale  recovered  images 


less  than  half  the  intensity  of  the  peak.  The  results  are  shown  in  Figure  6.13;  the 
joint  algorithm  was  able  to  resolve  bars  separated  by  only  three  pixels. 


6. 6  Conclusion 

The  joint  algorithm  proposed  here  provides  an  image  recovery  method  that  is 
less  affected  by  turbulence  than  deconvolution  and  suffers  from  fewer  stagnation  prob¬ 
lems  than  other  pupil  plane  algorithms.  The  algorithm  design  also  avoids  problems 
such  as  noise  amplification  and  image  decimation  commonly  associated  with  over 
iterating. 

This  work  was  performed  as  an  initial  step  in  implementing  a  high  resolution 
LADAR  system  using  multiple  data  sets.  In  the  simulations  for  this  paper  the  image 
plane  and  pupil  plane  data  sets  were  collected  through  identical  apertures;  this  does 
not  have  to  be  the  case.  A  major  benefit  to  this  algorithm  is  the  pupil  plane  data 
can  be  collected  using  a  conformal  array,  while  the  imaging  portion  is  made  small 
enough  to  be  installed  in  current  imaging  platforms.  This  has  the  potential  to  allow 
for  higher  resolution  images  without  greatly  increasing  system  volume.  Since  the 
changes  to  the  imaging  chain  are  only  software  changes  the  pupil  plane  array  could 
be  added  to  augment  existing  imaging  systems  rather  than  building  entirely  new. 
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(c)  Recovered  Image  (d)  Slice  through  the  recovered 

image 

Figure  6.13:  Quantification  of  the  resolution  of  the  joint  algorithm 
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VII.  Conclusion 


Three  image  recovery  algorithms  for  use  with  coherent  illumination  have  been  derived 
and  implemented.  The  first  was  a  minor  modification  to  the  stopping  criteria  of  a 
previous  work  [23],  while  the  other  two  are  new  work  entirely.  All  the  algorithms 
are  based  on  maximum  likelihood  methods.  Each  algorithm  was  evaluated  using 
Monte  Carlo  simulations.  The  improvement  of  the  joint  algorithm  over  the  image 
plane  (deconvolution)  algorithm  or  pupil  plane  algorithm  alone  will  provide  a  jump 
in  operational  capability.  The  design  of  the  algorithm  should  also  allow  for  easy 
hardware  implementation. 

7. 1  Summary  of  results 

In  the  previous  chapters  the  results  of  each  algorithm  were  analyzed  and  docu¬ 
mented.  The  analysis  for  each  included 

1.  Quantifying  the  resolution  of  each  algorithm  under  identical  conditions 

2.  Quantifying  the  ability  of  each  algorithm  to  recover  relative  radiometry 

3.  Showing  the  effects  of  turbulence  strength  on  the  recovered  images 

4.  Showing  the  effects  of  the  number  of  data  frames  on  the  recovered  images 

The  resolution  of  each  algorithm  was  defined  in  terms  of  pixels  using  simple  bar 
targets.  For  identical  aperture  sizes  and  turbulence  parameters  the  resolution  of  each 
system  is  10  pixels  for  deconvolution,  5  pixels  for  the  pupil  algorithm  and  3  pixels 
for  the  joint  algorithm.  Figure  7.1  shows  the  smallest  resolvable  bar  target  for  each 
algorithm.;  clearly  the  gained  resolution  is  significant. 

Both  deconvolution  and  the  joint  algorithm  were  able  to  recover  the  relative  ra¬ 
diometry  while  the  pupil  algorithm  was  not.  This  is  easily  explained  by  the  symmetry 
of  the  data  model  for  the  pupil  plane  data.  This  shows  the  importance  of  the  image 
plane  to  the  joint  algorithm. 

For  deconvolution  and  the  joint  algorithm  the  effects  of  turbulence  were  ana¬ 
lyzed.  In  both  cases  stronger  turbulence  gives  a  degraded  reconstruction,  but  the 
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(a)  Deconvolution 


(b)  Pupil  Algorithm 


(c)  Joint  Algorithm 


Figure  7.1:  Comparison  of  resolution  of  each  algorithm.  Each  sub  figure  shows  the 
smallest  resolvable  bar  target  for  each  algorithm 
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degradation  is  much  less  for  the  joint  algorithm.  Figure  7.2  shows  the  error  in  the 
reconstruction  for  varying  turbulence  strengths.  Only  in  the  case  of  very  week  tur¬ 
bulence  does  the  joint  algorithm  not  outperform  deconvolution.  Although  in  weak 


(a)  ^  =  20  (b)  -  =  10 

V  /  r0  v  '  r0 


(c)  2-  =  7.5  (d)  ^  =  5 

\  /  r0  v  1  rQ 


Figure  7.2:  Comparison  of  turbulence  effects  on  each  algorithm. 


turbulence  deconvolution  numerically  out  performs  the  joint  algorithm,  a  subjective 
look  at  two  recovered  images,  Figure  7.3,  shows  the  difference  is  negligible. 

Finally  in  all  cases  the  results  are  improved  dramatically  by  adding  more  frames 
of  data.  This  is  due  to  the  fact  that  both  image  plane  and  pupil  plane  data  models 
are  based  on  limit  functions  as  the  number  of  frames  goes  to  infinity. 
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(a)  Deconvolution 


Figure  7.3:  Comparison  of  images  in  weak  turbulence  =  5). 


7. 2  Significant  Contributions 

1.  A  new  more  robust  stopping  criteria  was  established  for  use  with  deconvolution 
algorithms.  The  stopping  criteria  is  based  on  the  statistics  of  the  data  and  the 
model  and  requires  no  a  priori  knowledge  of  the  object  being  imaged. 

2.  A  new  pupil  plane  imaging  algorithm  was  developed.  The  new  algorithm  has  a 
unique  stopping  criteria  and  can  deal  with  the  dc  term  imposed  on  the  data  by 
the  aperture  function  without  a  human  in  the  loop. 

3.  A  joint  image  and  pupil  plane  algorithm  was  developed.  This  algorithm  capital¬ 
izes  on  the  strengths  of  both  algorithms  and  has  been  shown  to  be  more  robust 
to  turbulence  than  the  other  algorithms.  The  joint  algorithm  consistently  pro¬ 
vides  better  results  (in  terms  of  our  chosen  metric)  than  the  other  algorithms 
except  in  the  case  of  extremely  weak  turbulence. 

4.  The  joint  algorithm  was  shown  to  be  much  less  sensitive  to  turbulence  than 
image  plane  algorithms.  This  was  shown  in  both  simulation  and  analytically 
(see  Appendix  B). 
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7.3  Recommended  Future  Work 

1.  Develop  an  optimized  rule  for  selecting  the  damping  parameter.  This  work 
has  demonstrated  some  relationship  between  the  damping  parameter  and  the 
number  of  frames,  but  it  is  not  optimized  and  breaks  down  at  low  frame  counts. 

2.  Use  a  pupil  array  larger  than  the  imaging  aperture.  The  original  intent  of 
this  effort  was  to  form  a  large  synthetic  array;  by  letting  the  pupil  array  be 
larger  than  the  imaging  aperture  it  may  be  possible  to  recover  additional  high 
frequency  details 

3.  Investigate  the  benefits  of  adding  a  phase  diversity  plane.  This  will  likely  yield 
little  or  no  improvement  since  it  will  be  blurred  by  turbulence  much  the  same 
as  the  focal  plane,  however  it  may  be  interesting  to  look  at. 

4.  Choose  a  better  maximization  algorithm.  This  algorithm  seeks  to  maximize  a 
likelihood  function  using  a  type  of  gradient  accent  similar  to  what  was  used  by 
Richardson  and  Lucy.  This  may  not  be  the  best  maximization  algorithm;  other 
methods  may  be  better  or  faster. 

7.4  Possible  Applications  of  this  Research 

There  are  three  primary  application  areas  for  this  research 

1.  Augment  existing  imaging  systems  by  adding  a  pupil  plane  detection  capability 

2.  Build  light  weight  imaging  systems  using  conformal  arrays 

3.  Implement  synthetic  aperture  ladar  systems  for  space  based  applications 
Each  of  these  application  areas  will  be  discussed  briefly. 

7.4-1  Augmentation  of  existing  systems.  Since  the  addition  of  the  pupil 
plane  data  does  not  affect  the  image  plane  data,  this  algorithm  could  be  applied 
to  existing  imaging  systems  by  simply  adding  a  pupil  plane  detector.  This  is  an 
important  consideration  in  terms  of  development  cost  and  schedule.  For  example  a 
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space  observation  telescope  (like  the  Starfire  Optical  range)  that  already  images  space 
objects  through  the  atmosphere  could  add  a  beam  splitter  in  the  optical  path  and  re- 
image  the  pupil  plane  and  collect  the  pupil  data  without  interfering  with  the  imaging 
path  (other  than  refocusing  the  system).  This  would  allow  for  higher  resolution 
imagery  at  a  lower  cost  than  a  full  system  redesign.  One  implied  assumption  is  that 
the  telescope  is  imaging  laser  illuminated  objects.  This  gives  an  upgraded  capability 
for  Space  Situational  Awareness  (SSA). 

7-4-2  Co7iformal  Arrays.  Many  of  the  Air  Forces  ISR  assets  are  being  flown 
on  Unmanned  Aerial  Systems  (UAS).  Since  these  systems  are  small  (as  compared  to 
more  traditional  collection  platforms),  there  is  a  need  for  the  optical  systems  to  be 
light  weight.  This  precludes  the  use  of  traditional  large  aperture  imaging  systems. 
Also  ISR  assets  have  to  deal  with  turbulence  so  some  mitigation  is  needed.  A  con¬ 
formal  array  to  collect  pupil  plane  data,  which  (like  the  above  scenario)  could  be 
used  to  augment  the  existing  imager,  could  be  added  to  the  skin  of  the  aircraft.  This 
gives  a  larger  (turbulence  resistant)  aperture  capability  will  very  little  added  weight 
or  volume. 

7-4-3  Synthetic  aperture  LADAR.  The  final  application  area  to  be  discussed 
is  synthetic  aperture  LADAR.  This  is  a  challenging  problem  due  to  the  difficulty  of 
measuring  the  phase  in  optical  systems.  The  algorithm  proposed  by  this  research 
avoids  that  problem  by  implementing  a  pupil  plane  recovery  algorithm  that  does  not 
require  a  phase  measurement.  An  array  of  sub-apertures  could  be  formed  to  collect 
pupil  intensity  and  this  intensity  used  along  with  a  low  resolution  image.  The  real 
benefit  of  synthetic  aperture  ladar  would  be  in  building  space  based  systems  where 
high  resolution  is  needed,  but  it  is  not  feasible  to  use  large  monolithic  apertures. 


Appendix  A.  Important  Proofs 


A .  1  Introduction 

This  appendix  is  intended  to  provided  detailed  proofs  of  critical  equations. 


A. 2  Equation  6.6 


Prove: 

j 

jim  j-iy'iJjn(or=6|ft(or+<Ji(o*ifc(or  (a.i> 

J — »oo  z — * 

n= 1 


where  £  is  a  two  dimensional  variable  in  the  object  plane. 


Proof: 


Given:  i?n(£)  is  a  single  frame  of  speckled  autocorrelation  data,  |/r(£)|2  is  the 
point  spread  function,  b  and  c  are  constants  and  i?(£)  is  the  true  autocorrelation  of 
the  obejct  brightness  function. 

It  is  helpful  to  notice  that  the  left  side  of  Equation  A.l  is  an  ensemble  average 
of  speckled  object  autocorrelations.  Since  that  phase  associated  with  the  surface 
roughness  of  the  target  is  the  only  random  part,  and  it  changes  each  coherence  time, 
we  can  rewrite  this  ensemble  average  as  a  time  average  (assuming  we  average  over 
many  coherence  times). 


=  6IMOI2  +  CJS({)  *  |/!.({)|2  (A.2) 

The  speckled  autocorrelation,  Rn  can  be  written  as 

Rn(Z)  =  [fn(Z,t)*fn{U)]*h(Z)  (A.3) 

where  *  and  *  represent  correlation  and  convolution  respectively,  and  /n(£,t)  = 
is  the  object  held  distribution.  By  applying  the  autocorrelation  theorem 
this  can  be  rewritten  as 


(A.4) 


RnP)  =  F~l  {\P  {fnPA)}\2H{x)} 


By  plugging  Equation  A. 4  in  to  the  left  side  of  Equation  A. 2  we  get 
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First  we  will  expand  the  inner  magnitude  squared  term 
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where  k  —  yl.  Next  we  take  an  inverse  Fourier  transform 
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Now  taking  the  magnitude  squared  of  this  value  we  arrive  at 
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H(x1)H%x2)ejkt{xi-X2)dZ1d&d£ld&dx1dx2  (A. 8) 


Taking  the  time  average  yields 
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Looking  at  just  the  time  dependent  piece  it  we  notice  that 


^d<A(6t)-<M6,d-<dClt)+ACLd]^  =  ^  6  _  Q  +  «$(£  _  £  _  Q  (A. 10) 
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where  8  is  a  Dirac  delta  function.  Taking  advantage  of  the  sifting  property  of  the 
Dirac  delta  function,  we  now  can  write 


H(x1)H*(x2)ejk^xl-X2)d^2dx1d^.ll) 


Next  we  make  the  following  substitutions 


U  =  X 1  —  x2 

(A. 12) 

dx  2  =  —du 

(A. 13) 

-  0  t- 

(A. 14) 

dt;  2  =  —  dw 

(A. 15) 

which  allows  us  to  write 
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Since  H  is  the  pupil  function,  we  know  (from  Fourier  Optics)  that  the  autocorrelation 
of  H  is  the  Optical  Transfer  Function  (OTF),  7i,  defined  as 


7i(u) 


H(x\)H*(xi  —  u)dx  i 


(A. 17) 


and  the  inverse  Fourier  transform  of  the  OTF  is  the  point  spread  function  (PSF) 
times  a  constant 

F-1  {H(u)}  =  K\h(£)\2  (A. 18) 
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Using  this  relationship  and  taking  advantage  of  the  symmetry  of  the  PSF 
write 


r*  OO  P  oo 


<1  Rn(0\2)  =  b\H0\2  +  /  /  «2(^i)a2(6  -  w)dCiK\h(w  -  t)\2dw 
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Finally  we  must  recognize  that 


R(w)  =  a  (fi )a  (6  -  w)d£i 


is  the  autocorrelation  of  the  object  intensity.  Using  this  we  can  arrive  at 


<l^n(0|2>  =  b\h(t)\2  +  c  /  R(w)\h(w  -i)\2dw 


which  is  simply 


(i^K)r>=6iMor+cfl«)»iA(oi2 
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Appendix  B.  Algorithm  insensitivity  to  atmosphere 

B.l  Problem  Statement 

Given  a  field  at  a  laser  illuminated  target,  g( X)  =  a( X)e^^x,t\  where  a  is  the 
amplitude  of  the  held  and  0  is  a  random  phase  distributed  uniformly  on  (— 7r,  7 r]  that 
is  statistically  independent  in  both  space  and  time.  Now  add  a  Kolmogorov  phase 
screen  at  the  target,  9( X,  t ),  and  aperture,  "0(A),  plane.  Prove  that  the  time  average  of 
the  modulus  squared  of  the  Inverse  fourier  transform  of  the  intensity  at  the  receiving 
aperture  in  the  Fraunhofer  region  is  unaffected  by  these  two  phase  screens. 


B.2  Solution 

The  held  in  the  aperture  plane  is  found  by  taking  a  Fourier  Transform  of  the 
held  in  the  target  plane: 


/OO 

a  (X)ej^x’t)  e~  % (XA)  dX  (B.l) 

-OO 

where  9  is  the  phase  imparted  by  the  phase  screen  at  the  target,  adding  the  phase 
screen  at  the  aperture  and  taking  the  magnitude  squared  of  this  quantity  yields 


e-£l(xi-x2)MdX1dX2  ■  ■  ■ 


(B.2) 


where  0  is  the  phase  screen  in  the  aperture.  It  should  be  clear  that  it  has  no  influence 
in  the  result  since 

ej(il>(A,t)-i/>(A,t))  _  ^  (B.3) 

Equation  B.2  represents  a  short  exposure  intensity  in  the  aperture  plane.  The  next 
step  in  the  proof  is  to  take  the  inverse  Fourier  transform  of  this  quantity  and  then 
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We  were  given  that  (j)  is  uniform  on  (— vr,  7r]  and  we  know  the  phase  screens  have  joint 
Gaussian  statistics  and  are  defined  for  all  values  of  phase.  This  makes  it  necessary 
to  redefine  the  density  of  cj)  to  cover  all  phase  values.  We  can  simplify  a  little  by 
recognizing  that  the  area  under  a  Gaussian  curve  becomes  very  small  in  the  tails;  this 
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allows  us  to  define  the  uniform  density  function  over  a  finite  range  of  phase 


V'M  1,02)  = 


2nn 


(B.7) 


where  <p\  G  (— mr,  mr]  and  <\>2  G  (— mr,  mr]  where  n  is  an  integer.  The  value  of  n  is 
chosen  such  that 

/nn  rnn 

/  p0(9  i,02)«l  (B.8) 


'  —nn  ’J  —nn 


By  plugging  eq(B.7)  into  eq(B.6)  it  can  be  shown  that 


rnn  pnn 


Pz{z1,Z2)  = 


2mr 


Pe(zi  -  <pi,  z2  -  <t>2)dlj>idlj>2 


(B.9) 


'  —nn  J  —nn 


If  we  plug  in  Q\  =  z\  —  4>i,  92  =  z2  —  4>2,  d&i  =  —d6\,  and  d(j)2  =  —  d02  we  can  write 


pnn  rnn 


Pz(zi,Z2)  = 


2mr 


Po(0i,02)d0id02 


(B.10) 


— nn  J  —nn 


which  gives  us 


Pz(z1,z2 ) 


2mr 


(B.ll) 


which  is  clearly  uniform  and  uncorrelated. 
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Appendix  C.  Investigation  of  the  spatial  independence  of  speckle  noise 

For  speckle  noise  to  be  truly  independent  from  pixel  to  pixel  the  average  speckle  size 
must  be  smaller  than  a  pixel.  We  have  to  test  this  assumption  in  both  the  image 
plane  and  the  pupil  plane.  We  will  look  first  at  the  pupil  plane. 

Dr.  Joseph  Goodman  developed  a  measure  for  the  average  size  of  a  speckle  lobe 
in  any  non-imaged  plane  [18].  By  assuming  the  object  has  uniform  brightness  one  can 
find  the  average  area  of  a  speckle  lobe  by 


Sc 


AV 

A 


(C.l) 


where  A  is  the  wavelength  of  the  optical  radiation,  z  is  the  propagation  length,  and  A 
is  the  area  of  the  target.  If  we  lift  the  uniform  brightness  restriction  the  expression  is 

=  \2  2  f-oof-ool2(U’V)dudv 
Z  [ff°ooff°ooI(u,v)dudv\2 

The  above  equations  give  an  expression  for  the  area  of  a  speckle  lobe.  There  is  no 
expression  for  the  linear  dimensions  of  a  speckle  lobe,  but  taking  the  square  root  of 
the  area  will  yield  a  good  approximation. 

The  speckle  size  in  the  image  plane  is  found  by  assuming  the  pupil  aperture  is 
a  uniformly  bright  source  and  using  the  image  distance  as  the  propagation  distance. 

For  the  geometry  used  in  this  research,  the  speckle  correlation  extends  over  a 
few  pixels  (3-5),  so  the  assumption  of  independence  is  not  strictly  true,  but  does  not 
depart  from  it  significantly. 
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